INTEGRATED DETECTION, ESTIMATION AND COMMUNICATION THEORIES


The objectives of this program were to investigate the synergies among the decision,
estimation and communication aspects of a distributed multisensor system. The effort in this
project was hence concentrated primarily on the development of a coherent framework for
data fusion and the development of a coherent theory of distributed decision capable of
incorporating estimation and communication aspects.

A fair amount of effort was focused on the development of a distributed decision fusion
theory. In this context a Neyman-Pearson type theory was developed for the distributed
decision fusion problem. The theory has been developed for the binary hypothesis testing
problem with both binary and M-ary quantized decisions at the local (sensor) level. The
theory has established that, under statistical independence, the optimal fusion configuration
consists of binary (or M-ary) level likelihood quantizers at the sensor level, and a binary
Neyman-Pearson test at the fusion center. Variants of this optimal Neyman-Pearson solution have
been investigated, and the optimal (in the Bayesian or N-P sense) solutions were obtained in
the presence of propagation delays in the transmission of the decisions from the sensors to
the fusion center, in the presence of errors in the fused data, and in the presence of sensor
misalignment and communication constraints in the provision of information.

Other issues involved in the design of a distributed decision fusion system, such as
intersensor correlation and multiresolution detection, have been investigated.

A large number of publications have emerged from this project and have appeared in
various journals and conference proceedings. A sample of a few publications is attached.

The success of this program has led to the teaming of the P.I. with Calspan and Grumman
Corporations and the submission of a proposal for Pre-Detection Fusion to Rome ADC. The
contract for the Pre-Detection Fusion program was awarded to our team. The
project has ended successfully. The acquired experience from this project, the first controlled
environment data fusion project, has been invaluable.

The basis of distributed decision theory has been expanded to more general fusion concepts.
As a result, a Generalized Evidence Processing (GEP) theory was developed. The developed
theory attempts to reconcile the Bayesian with the Dempster-Shafer theory. Numerical
comparisons between the GEP and conventional distributed fusion algorithms highlight the
superior performance of GEP as compared to the conventional distributed decision theory.

A list of publications that resulted from this project, a list of recent publications that relate to
this project directly, and a sample of the main publications that emerged from the project
follow.
List of publications from this project


A.  Refereed  Journals  and  Proceedings 

S. C. A. Thomopoulos, R. Viswanathan and D. K. Bougoulias, "Optimal Decision Fusion in
Multiple Sensor Systems," IEEE Transactions on Aerospace and Electronic Systems, Vol. 23,
No. 5, Sept. 1987, pp. 644-653.

R. Viswanathan, S. C. A. Thomopoulos, and R. Tumuluri, "Optimal Serial Distributed
Decision Fusion," IEEE Transactions on Aerospace and Electronic Systems, Vol. 24, No. 4,
July 1988, pp. 366-376.

S. C. A. Thomopoulos, R. Viswanathan, and D. K. Bougoulias, "Optimal Distributed Decision
Fusion," IEEE Transactions on Aerospace and Electronic Systems, Vol. 25, No. 5, Sept. 1989,
pp. 761-764.

S. C. A. Thomopoulos, "Sensor Integration and Data Fusion," Invited paper in special issue
on Sensor Integration and Data Fusion for Robotic Systems, Journal of Robotic Systems, Vol.
7, No. 3, 1990, pp. 337-372.


S. C. A. Thomopoulos and N. Okello, "Distributed Detection with Consulting Sensors and
Communication Cost," IEEE Transactions on Automatic Control, Vol. 37, No. 9, September
1992, pp. 1398-1405.

S. C. A. Thomopoulos and L. Zhang, "Distributed Decision Fusion with Networking Delays
and Channel Errors," Information Sciences: An International Journal, Vol. 66, Nos. 1 & 2,
December 1, 1992, pp. 91-118.

S. C. A. Thomopoulos and N. N. Okello, "Distributed and Centralized Multi-Sensor Detection
with Misaligned Sensors," Information Sciences: An International Journal, to appear.

I. N. M. Papadakis and S. C. A. Thomopoulos, "Hypothesis Testing using Structured
Networks," IEEE Transactions on Automatic Control, to appear.

S. C. A. Thomopoulos, D. K. Bougoulias and C.-D. Wann, "Dignet: An Unsupervised
Clustering Algorithm for Centralized and Distributed Pattern Recognition, Classification, and
Hypothesis Testing," IEEE Transactions on Aerospace and Electronic Systems, to appear.

S. C. A. Thomopoulos, R. Viswanathan, and D. K. Bougoulias, "Optimal Decision Fusion in
Multiple Sensor Systems," 24th Allerton Conference, Allerton House, Monticello, IL, Oct. 1-3,
1986.

R. Viswanathan, S. C. A. Thomopoulos, and V. Aalo, "Distributed Detection with Correlated
Sensor Noise," 25th Allerton Conference, Allerton House, Monticello, IL, Sept. 30-Oct. 2,
1987.

V. Aalo, R. Viswanathan, and S. C. A. Thomopoulos, "A Study of Distributed Detection
with Correlated Sensor Noise," Proceedings of IEEE GLOBECOM '87, Tokyo, Japan, Nov.
15-18, 1987.

S. C. A. Thomopoulos, "Optimal and Suboptimal Decision Fusion," 35th SIAM Meeting,
Society for Industrial and Applied Mathematics, Denver, Colorado, Oct. 12-15, 1987.

S. C. A. Thomopoulos, L. Zhang, and R. Viswanathan, "Distributed Detection and
Networking," Symposium on Innovative Science and Technology, SPIE '88, Los Angeles,
CA, January 10-15, 1987.

R. Viswanathan, S. C. A. Thomopoulos, and R. Tumuluri, "Sequential Decision in Multiple
Sensor Fusion," Proceedings of 21st CISS, The Johns Hopkins University, March 25-27,
1987.

S. C. A. Thomopoulos and N. Okello, "Distributed Detection with Consulting Sensors and
Communication Cost," SPIE's 1988 Technical Symposium on Optics, Electro-Optics and
Sensors, Orlando, FL, April 4-8, 1988.

S. C. A. Thomopoulos and L. Zhang, "Distributed Decision Fusion with Networking Delays
and Channel Errors," SPIE Proceedings, Vol. 931, Sensor Fusion, (1988), pp. 154-160.

S. C. A. Thomopoulos and L. Zhang, "Distributed Filtering with Random Sampling and
Delay," SPIE Proceedings, Vol. 931, Sensor Fusion, (1988), pp. 31-40.

S. C. A. Thomopoulos, D. K. Bougoulias, and L. Zhang, "Optimal and Suboptimal Distributed
Decision Fusion," SPIE Proceedings, Vol. 931, Sensor Fusion, (1988), pp. 26-30.

S. C. A. Thomopoulos, R. Viswanathan and D. K. Bougoulias, "Optimal and Suboptimal
Distributed Decision Fusion," 22nd Annual Conference on Information Sciences and Systems,
Princeton University, March 16-18, 1988.

S. C. A. Thomopoulos and N. Okello, "Distributed Detection with Consulting Sensors and
Communication Cost," 22nd Annual Conference on Information Sciences and Systems,
Princeton University, March 16-18, 1988.
S. C. A. Thomopoulos and L. Zhang, "Networking in Distributed Decision Fusion," American
Control Conference ACC '88, Atlanta, GA, June 15-17, 1988.

S. C. A. Thomopoulos and N. Okello, "Decision Fusion with Consulting Sensors," American
Control Conference ACC '88, Atlanta, GA, June 15-17, 1988.

S. C. A. Thomopoulos and N. Okello, "Distributed Detection with Mismatched Sensors,"
SPIE 1988 Cambridge Symposium on Advances in Intelligent Robotics Systems, Boston, MA,
October 1988.

S. C. A. Thomopoulos and L. Zhang, "Distributed Filtering with Random Sampling and
Delay," Conference on Decision and Control, CDC '88, Austin, Texas, December 7-10, 1988.


S. C. A. Thomopoulos, "Sensor Integration and Data Fusion," SPIE Proceedings on Advances
in Intelligent Robotics Systems, Sensor Fusion II: Human and Machine Strategies, Vol. 1198,
pp. 178-191.

S. C. A. Thomopoulos and L. Nilsson, "Object Tracking for Sequences of Images Using
Stereo Camera," SPIE Proceedings on Advances in Intelligent Robotics Systems, Sensor
Fusion II: Human and Machine Strategies, Vol. 1198, pp. 156-169.

S. C. A. Thomopoulos, "Theories in Distributed Decision Fusion: Comparison and
Generalization," SPIE Proceedings, Vol. 1383, Sensor Fusion III, Nov. 1990.

S. C. A. Thomopoulos and D. K. Bougoulias, "DIGNET: A Self-Organizing Neural Network
for Automatic Pattern Recognition and Classification," Proceedings of CISS, Johns Hopkins,
March 1991.

S. C. A. Thomopoulos and D. K. Bougoulias, "DIGNET: A Self-Organizing Neural Network
for Automatic Pattern Recognition and Classification," SPIE's OE/Aerospace Sensing,
Orlando, FL, 1-5 April, 1991.

S. C. A. Thomopoulos, "Decision and Evidence Fusion in Sensor Integration,"
of S, pp. 339-412, Academic Press, Nov. 1991.
Fusion for Intelligent Machines and Systems, Editors R. C. Luo and M. G.
ABSTRACT


The objectives of this program were to investigate the synergies among the decision, estimation and
communication aspects of a distributed multisensor system. Hence, the effort in this project was primarily
concentrated on the development of a coherent framework that would allow the development of a coherent theory
of distributed decision and incorporate estimation and communication aspects. In this context, a Neyman-Pearson
theory for distributed decision fusion was developed. The effect of communications and topological
aspects on the structure and performance of the optimal distributed decision fusion has been investigated. The
optimal distributed Neyman-Pearson decision fusion has been derived for the ideal case, and in cases where
transmission delays, channel errors, and sensor misalignment are present. Other issues involved in the design of a
distributed decision fusion system, such as intersensor correlation and multiresolution detection, have also been
investigated. A Generalized Evidence Processing theory that extends and, to a certain extent, unifies the Bayesian
and Dempster-Shafer theories has been developed. A systematic framework for data fusion analysis and
synthesis has also been developed and tested successfully with experimental data.
Optimal Decision Fusion in
Multiple Sensor Systems

STELIOS C. A. THOMOPOULOS, Member, IEEE
RAMANARAYANAN VISWANATHAN, Member, IEEE
DIMITRIOS K. BOUGOULIAS, Student Member, IEEE
Southern Illinois University


The problem of optimal data fusion in the sense of the Neyman-
Pearson (N-P) test in a centralized fusion center is considered. The
fusion center receives data from various distributed sensors. Each
sensor implements a N-P test individually and independently of the
other sensors. Due to limitations in channel capacity, the sensors
transmit their decision instead of raw data. In addition to their
decisions, the sensors may transmit one or more bits of quality
information. The optimal, in the N-P sense, decision scheme at the
fusion center is derived and it is seen that an improvement in the
performance of the system beyond that of the most reliable sensor is
feasible, even without quality information, for a system of three or
more sensors. If quality information bits are also available at the
fusion center, the performance of the distributed decision scheme is
comparable to that of the centralized N-P test. Several examples are
provided and an algorithm for adjusting the threshold level at the
fusion center is provided.


Manuscript received July 17, 1986; revised March 3, 1987.

This research is sponsored by the SDIO/IST and managed by the Office
of Naval Research under Grant N00014-86-K-0513.

Authors' address: Dept. of Electrical Engineering, College of
Engineering and Technology, Southern Illinois University, Carbondale,
IL 62901.

0018-9251/87/0900-0644 $1.00 © 1987 IEEE


I. INTRODUCTION


The problem of data fusion in a central decision
center has attracted the attention of several investigators
due to the increasing interest in the deployment of
multiple sensors for communication and surveillance
purposes. Because of a limited transmission capacity, the
sensors are required to transmit their decision (with or
without quality information bits) instead of the raw data
the decisions are based upon. A centralized fusion center
is responsible for combining the received information
from the various sensors into a final decision.

Tenney and Sandell [1] have treated the Bayesian
detection problem with distributed sensors. However,
they did not consider the design of data fusion
algorithms. Sadjadi [2] has considered the problem of
general hypothesis testing in a distributed environment
and has provided a solution in terms of a number of
coupled equations. The decentralized sequential detection
problem has been investigated in [3-5]. Chair and
Varshney [6] have considered the problem of data fusion
in a central center when the data that the fusion center
receives consist of the decisions made by each sensor
individually and independently from each other. They
derive the optimal fusion rule for the likelihood ratio
(LR) test. It turns out that the sufficient statistic for the
LR test is a weighted average of the decisions of the
various sensors with weights that are functions of the
individual probabilities of false alarm and the
probabilities of detection. However, the maximum a
posteriori (MAP) test or the LR test require either exact
knowledge of the a priori probabilities of the tested
hypotheses or the assumption that all hypotheses are
equally likely. If the Neyman-Pearson (N-P) test
is employed at each sensor, the same test must be used to
fuse the data at the fusion center, in order to maximize
the probability of detection for fixed probability of false
alarm.

We derive the optimal decision scheme when the N-P
test is used at the fusion center. The optimal decision
scheme, in the N-P sense, is derived: 1) for cases where
the various sensors transmit exclusively their decisions to
the fusion center, and 2) for cases where the various
sensors transmit quality bits along with their decisions
indicating the degree of their confidence in their decision.

II. DECISION FUSION WITH THE NEYMAN-PEARSON TEST

Consider the problem of two hypotheses testing with
$H_1$ designating one hypothesis and $H_0$ the alternative.
Assume that the prior probabilities on the two hypotheses
are not known. A number of sensors N receive
observations and independently implement the N-P test.
Let $u_i$ designate the decision of the ith sensor having
taken into account all the observations available to this
sensor at the time of the decision. If the decision of the
ith sensor favors hypothesis $H_1$, the sensor sets $u_i = +1$,
otherwise it sets $u_i = -1$. Every sensor transmits
its decision to the fusion center, so that the fusion center
has all N decisions available for processing at the time of
the decision making. Let $(P_{F_i}, P_{D_i})$ designate the pair of
the probability of false alarm and the probability of
detection at which the ith sensor operates and implements
the N-P test. The fusion center implements the N-P test
using all the decisions that the individual sensors have
communicated, i.e., it formulates the LR test:

$$\Lambda(u) = \frac{P(u_1, u_2, \ldots, u_N \mid H_1)}{P(u_1, u_2, \ldots, u_N \mid H_0)} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; t \tag{1}$$

where $u = (u_1, u_2, \ldots, u_N)$ is a $1 \times N$ row vector with
entries the decisions of the individual sensors, and $t$ the
threshold to be determined by the desirable probability of
false alarm $P_F^f$ at the fusion center, i.e.,

$$\sum_{\Lambda(u) > t} P(\Lambda(u) \mid H_0) = P_F^f \tag{2}$$


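Since the decisions of each sensor are independent
from each other, the LR test (1) gives

$$\Lambda(u) = \prod_{i=1}^{N} \frac{P(u_i \mid H_1)}{P(u_i \mid H_0)} = \prod_{i=1}^{N} \Lambda(u_i) \;\underset{H_0}{\overset{H_1}{\gtrless}}\; t \tag{3}$$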


from which the result in [6] is readily obtained. In order
to implement the N-P test we need to compute
$P(\Lambda(u) \mid H_0)$. However, due to the independence
assumption, it is easier to obtain the distribution
$P(\log \Lambda(u) \mid H_0)$, which can be expressed as the convolution of
the individual $P(\log \Lambda(u_i) \mid H_0)$. Thus, it follows from (3)

$$P(\log \Lambda(u) \mid H_0) = P(\log \Lambda(u_1) \mid H_0) * \cdots * P(\log \Lambda(u_N) \mid H_0). \tag{4}$$

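The LR $\Lambda(u_i)$ assumes two values: either $(1 - P_{D_i})/(1 - P_{F_i})$
when $u_i = -1$, with probability $1 - P_{F_i}$ under
hypothesis $H_0$ and probability $1 - P_{D_i}$ under hypothesis
$H_1$; or $P_{D_i}/P_{F_i}$ when $u_i = +1$, with probability $P_{F_i}$ under
hypothesis $H_0$ and probability $P_{D_i}$ under hypothesis $H_1$.
Hence, we can write

$$P(\log \Lambda(u_i) \mid H_0) = (1 - P_{F_i})\, \delta\!\left(\log \Lambda(u_i) - \log \frac{1 - P_{D_i}}{1 - P_{F_i}}\right) + P_{F_i}\, \delta\!\left(\log \Lambda(u_i) - \log \frac{P_{D_i}}{P_{F_i}}\right) \tag{5}$$

and

$$P(\log \Lambda(u_i) \mid H_1) = (1 - P_{D_i})\, \delta\!\left(\log \Lambda(u_i) - \log \frac{1 - P_{D_i}}{1 - P_{F_i}}\right) + P_{D_i}\, \delta\!\left(\log \Lambda(u_i) - \log \frac{P_{D_i}}{P_{F_i}}\right) \tag{6}$$

where

$$\delta(x) = \begin{cases} 1 & \text{for } x = 0 \\ 0 & \text{for } x \neq 0. \end{cases}$$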


At the fusion center, the probability of false alarm is

$$P_F^f = \sum_{\Lambda(u) > t^*} P(\Lambda(u) \mid H_0) \tag{7}$$

where $t^*$ is a threshold chosen to satisfy (2) for a given
$P_F^f$. Similarly, the probability of detection at the fusion
center is

$$P_D^f = \sum_{\Lambda(u) > t^*} P(\Lambda(u) \mid H_1). \tag{8}$$


A.  Similar  Sensors 


When all the sensors are similar and operate at the
same level of probability of false alarm and probability of
detection, i.e., $P_{F_i} = P_{F_j} = P_F$ and $P_{D_i} = P_{D_j} = P_D$ for
every i and j, all the probability distributions in (3) are
the same and the N-P test leads to the following scheme
at the fusion center. (Expressions similar to (9) and (10)
were obtained in [6] for the LR test.)

$$\sum_{i=1}^{N} a_i u_i \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \log t \tag{9}$$

where

$$a_i = \begin{cases} \log \dfrac{P_D}{P_F} & \text{if } u_i = +1 \\[2ex] \log \dfrac{1 - P_F}{1 - P_D} & \text{if } u_i = -1 \end{cases} \qquad i = 1, \ldots, N. \tag{10}$$

If k out of the N decisions favor hypothesis $H_1$, (9) can
be rewritten as

$$k \log \frac{P_D (1 - P_F)}{P_F (1 - P_D)} \;\underset{H_0}{\overset{H_1}{\gtrless}}\; \log t + N \log \frac{1 - P_F}{1 - P_D}. \tag{11}$$

For all sensible tests, though, $P_F < P_D$. Hence,
$\log \dfrac{P_D (1 - P_F)}{P_F (1 - P_D)} > 0$ and the N-P test becomes

$$k \;\underset{H_0}{\overset{H_1}{\gtrless}}\; t^* \tag{12}$$

where $t^*$ is some threshold to be determined so that a
certain overall false alarm probability $P_F^f$ is attained at
the fusion center.

The random variable k has a binomial distribution
with parameters N and $P_F$ under $H_0$ and parameters N and
$P_D$ under $H_1$. Hence, $P_F^f$ and the overall probability of
detection $P_D^f$ are given by

$$P_F^f = \sum_{k = \lceil t^* \rceil}^{N} \binom{N}{k} P_F^k (1 - P_F)^{N-k} \tag{13}$$

$$P_D^f = \sum_{k = \lceil t^* \rceil}^{N} \binom{N}{k} P_D^k (1 - P_D)^{N-k} \tag{14}$$

where $\lceil t^* \rceil$ indicates the smallest integer exceeding $t^*$.
The threshold $t^*$ must be determined so that (13) gives an
acceptable overall probability of false alarm.
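
As an illustration of the weighted-sum form of the fusion rule in (9) and (10), the following minimal Python sketch checks numerically that the sum of weighted decisions equals the logarithm of the product-form LR in (3). The function names `weighted_decision_sum` and `log_likelihood_ratio` are hypothetical and chosen only for this example; decisions are assumed to be coded as +1 (favoring H1) and -1 (favoring H0).

```python
import math


def weighted_decision_sum(u, pf, pd):
    """Left-hand side of (9): sum of a_i * u_i with the weights a_i of (10).

    u  : sequence of decisions, each +1 (favors H1) or -1 (favors H0)
    pf : per-sensor probabilities of false alarm
    pd : per-sensor probabilities of detection
    """
    total = 0.0
    for ui, f, d in zip(u, pf, pd):
        if ui == 1:
            a = math.log(d / f)
        else:
            a = math.log((1.0 - f) / (1.0 - d))
        total += a * ui
    return total


def log_likelihood_ratio(u, pf, pd):
    """Logarithm of the product-form LR in (3), computed term by term."""
    total = 0.0
    for ui, f, d in zip(u, pf, pd):
        if ui == 1:
            total += math.log(d / f)                   # P(u_i=+1|H1) / P(u_i=+1|H0)
        else:
            total += math.log((1.0 - d) / (1.0 - f))   # P(u_i=-1|H1) / P(u_i=-1|H0)
    return total


if __name__ == "__main__":
    u = (1, 1, -1)
    pf = [0.05, 0.05, 0.05]
    pd = [0.95, 0.95, 0.95]
    print(weighted_decision_sum(u, pf, pd), log_likelihood_ratio(u, pf, pd))
```

For similar sensors the statistic depends on the decisions only through the number k of sensors favoring H1, which is why the test reduces to a comparison of k with a threshold.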

For the configuration of N sensors, we are interested
to know whether the N-P test can provide a $(P_F^f, P_D^f)$ pair
such that

$$P_F^f \le \min_{i \in N} \{P_{F_i}\} \quad \text{and} \quad P_D^f > \max_{i \in N} \{P_{D_i}\} \tag{15}$$

where $(P_{D_i}, P_{F_i})$ is the N-P test level for sensor i,
$i = 1, \ldots, N$.

The next Theorem shows that condition (15) can be
satisfied if the randomized N-P test is used at the fusion
center, the number of sensors N is greater than two, and
all the sensors are characterized by the same $(P_F, P_D)$
pair.

Theorem. In a configuration of N similar sensors,
all operating at the same $(P_F, P_D) = (p, q)$, the
randomized N-P test at the fusion center can provide a
$(P_F^f, P_D^f)$ satisfying (15) if $N \ge 3$.

More precisely, for $N \ge 3$, the randomized N-P test
can be fixed so that

$$P_F^f \le P_F = p \quad \text{and} \quad P_D^f > P_D = q \tag{16}$$

where $P_F$ and $P_D$ are the probability of false alarm and
probability of detection at the individual sensors.

Proof. First we show that for N = 2, condition (15)
cannot be satisfied with the second inequality as a strict
one. Then we prove that for N = 3, the randomized N-P
test satisfies condition (15). By using the fact that for
fixed probability of false alarm, the probability of
detection at the fusion center is maximized by the N-P
test among all mappings from the observation space into
the decision space, we prove by induction that condition
(15) is satisfied for all $N \ge 3$.

Let N = 2 and $(P_F, P_D) = (p, q)$ for both sensors.
Using (4), (5), (6), (9), and (10), the LR distributions at
the fusion center under hypothesis $H_0$ and $H_1$ are plotted
for the reader's convenience in Figs. 1 and 2,
respectively. Since for all p in (0, 1)

$$p^2 < p < 2p(1 - p) + p^2 \tag{17}$$

it follows that, in order to satisfy $P_F^f = p$, the
randomized N-P test must be used at the fusion center
with threshold $q(1 - q)/p(1 - p)$ and randomizing factor
$\omega$ defined by

$$p^2 + \omega\, 2p(1 - p) = p \tag{18}$$

where $0 < \omega < 1$. Solving (18) we obtain $\omega = 0.5$,
independent of p. Since $P_D^f$ is determined by an
expression symmetric to (18) (see Figs. 1 and 2), $P_D^f = q$
for $\omega = 0.5$. Hence, neither condition (16) nor
condition (15) (which is more restrictive) can be satisfied
for N = 2.

Fig. 1. Distribution of LR at fusion center under hypothesis $H_0$ for
two similar sensor system, N = 2.

Fig. 2. Distribution of LR at fusion center under hypothesis $H_1$ for
two similar sensor system, N = 2.

Let N = 3. The distributions of the LR under $H_0$ and
$H_1$ are given in Figs. 3 and 4, respectively. From Fig. 3,
if the threshold at the fusion center is set at
$q^2(1 - q)/p^2(1 - p)$,

$$P_F^f = p^3 + 3p^2(1 - p) < p \tag{19}$$

for $0 < p < 0.5$. The left-hand side (LHS) of inequality
(19) is greater than p for $p > 0.5$. Hence, since $P_F < 0.5$,
the randomized N-P test that satisfies (15) at the
fusion center is determined by

$$p^3 + 3p^2(1 - p) + \omega\, 3p(1 - p)^2 = p \tag{20}$$

from which

$$\omega = \frac{1 - 2p}{3(1 - p)}. \tag{21}$$

Hence, $\omega$ is a positive fraction for $0 < p < 0.5$.

Since $P_D^f$ at the fusion center is given by an
expression similar to (20) (see Fig. 4), with q in place of
p, and $q > 0.5$, it follows from (20) that $P_D^f > q$, which
proves the Theorem for N = 3.

Fig. 3. Distribution of LR at fusion center under hypothesis $H_0$ for
three similar sensor system, N = 3.

Fig. 4. Distribution of LR at fusion center under hypothesis $H_1$ for
three similar sensor system, N = 3.
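
The N = 3 step of the proof can be checked numerically. The sketch below is only an illustration under the assumptions stated in the text: three similar sensors at level (p, q), a fusion center that decides H1 when at least two sensors favor H1, and that decides H1 with probability omega when exactly one sensor favors H1. The function name `randomized_fusion_n3` is hypothetical.

```python
def randomized_fusion_n3(p, q):
    """Randomized N-P fusion of three similar sensors operating at (p, q).

    Returns (omega, pf_fusion, pd_fusion), where omega solves (20) so that
    the fusion-center false alarm probability equals p exactly.
    """
    # Mass of the LR distribution above the randomization point under H0:
    # all three sensors, or exactly two sensors, favor H1.
    mass_above_h0 = p**3 + 3 * p**2 * (1 - p)
    # Mass at the randomization point under H0: exactly one sensor favors H1.
    mass_at_h0 = 3 * p * (1 - p) ** 2
    omega = (p - mass_above_h0) / mass_at_h0

    pf_fusion = mass_above_h0 + omega * mass_at_h0
    # Same expression with q in place of p gives the detection probability.
    pd_fusion = q**3 + 3 * q**2 * (1 - q) + omega * 3 * q * (1 - q) ** 2
    return omega, pf_fusion, pd_fusion


if __name__ == "__main__":
    omega, pf_fusion, pd_fusion = randomized_fusion_n3(0.05, 0.95)
    print(omega, pf_fusion, pd_fusion)
```

For any p below 0.5 and q above 0.5 the returned false alarm probability equals p while the detection probability exceeds q, which is the content of the N = 3 step.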

Assume that the randomized N-P test satisfies
condition (16) for an arbitrary number of sensors N. We
show that it also satisfies the condition for N + 1, and
thus complete the induction and the proof of the
Theorem.

Let $U_N = \{u_1, u_2, \ldots, u_N\}$ designate the set of
decisions from the N sensors that are available at the
fusion center. All the sensors operate at the same level
(p, q). Let $f_N(U_N)$ designate some decision rule at the
fusion center operating at fixed probability of false alarm
p. Let $f_N^{NP}(U_N)$ designate the randomized N-P test at the
fusion center at level p. For fixed probability of false
alarm, the probability of detection at the fusion center
(power of test) is maximized for the N-P decision rule
among all possible decision rules.

Let $U_{N+1} = \{U_N, u_{N+1}\}$ designate the decision
ensemble of N + 1 similar sensors all operating at the
same level (p, q). Then by choosing $f_{N+1}(U_{N+1}) = f_N^{NP}(U_N)$,

$$P_D\big(f_{N+1}^{NP}(U_{N+1})\big) = \max_{f_{N+1}} P_D\big(f_{N+1}(U_{N+1})\big) \ge P_D\big(f_N^{NP}(U_N)\big) \tag{22}$$

from which it follows that

$$P_D^f \big|_{N+1} \ge P_D^f \big|_{N} > q.$$

Thus the induction is complete and so is the proof of
the Theorem.

Consider a system of four sensors, N = 4, all operating
at $P_F = 0.05$ and $P_D = 0.95$. If $t^* = 2$, from the
binomial cumulative table we get $P_F^f = 0.014$ and $P_D^f =
0.9995$ at the fusion center, i.e., a considerable
improvement in the performance of the overall system.
From the binomial cumulative table it can be seen that at
least three sensors are required for the decision fusion
scheme to improve the performance of the system, as the
Theorem suggests.
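
The four-sensor figures quoted above can be reproduced without a binomial table. The following Python sketch assumes that the fusion center decides H1 when at least two of the four sensors favor H1; the helper name `k_out_of_n_tail` is hypothetical.

```python
from math import comb


def k_out_of_n_tail(n, k_min, prob):
    """Probability that at least k_min of n independent sensors decide H1,
    when each does so with probability prob (binomial upper tail)."""
    return sum(
        comb(n, k) * prob**k * (1 - prob) ** (n - k)
        for k in range(k_min, n + 1)
    )


if __name__ == "__main__":
    n, k_min = 4, 2
    pf_fusion = k_out_of_n_tail(n, k_min, 0.05)  # about 0.014
    pd_fusion = k_out_of_n_tail(n, k_min, 0.95)  # about 0.9995
    print(pf_fusion, pd_fusion)
```

Evaluating the same helper under H0 and H1 mirrors the pair of binomial sums that give the fusion-center false alarm and detection probabilities for similar sensors.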


To assess the performance of the fusion scheme
further, we compare it with the best centralized scheme,
the N-P test which utilizes raw data, not decisions, from
the different sensors. The loss associated with the use of
decisions instead of raw data at the fusion center is
assessed by means of a simple example. Let a single
observation from each of the four (N = 4) sensors be
distributed normally (see Fig. 5) as

$$r_i \sim \begin{cases} G(0, 1), & \text{under } H_0 \\ G(S, 1), & \text{under } H_1. \end{cases} \tag{23}$$


Fig. 5. Data distribution at each sensor under hypotheses $H_1$ and $H_0$,
and confidence regions. Threshold is indicated by T. The intervals
$(-\infty, T_L)$ and $(T_U, \infty)$ are designated "confidence" regions. Interval
$(T_L, T_U)$ is designated "no confidence" region.

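The N-P test utilizing all the $r_i$'s will have the form

$$\sum_{i=1}^{N} r_i \;\underset{H_0}{\overset{H_1}{\gtrless}}\; t_0. \tag{24}$$

To achieve a false alarm $P_F^c$, a threshold of

$$t_0 = \sqrt{N}\, Q^{-1}(P_F^c) \tag{25}$$

is needed at the fusion center, where $Q(\cdot) = 1 - \Phi(\cdot)$,
with $\Phi(\cdot)$ the cumulative distribution function (cdf) of
the standard normal, and $Q^{-1}$ is the inverse function of
Q. Moreover,

$$P_D^c = Q\!\left(\frac{t_0 - NS}{\sqrt{N}}\right). \tag{26}$$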

To obtain a $P_F = 0.05$ and $P_D = 0.95$ at each
sensor, a threshold satisfying $T = Q^{-1}(0.05)$ is required,
from which T = 1.64, and $0.05 = 1 - Q(T - S)$, from
which S = 3.29.

Consider achieving a $P_F^c = 0.001$ at the fusion center
with the four sensors. This requires a threshold $t_0 = 2\,
Q^{-1}(0.001) = 6.18$, from which $P_D^c = 0.9998$ (see (25)
and (26)).
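
The centralized figures can be checked with the standard normal distribution from the Python standard library. The sketch below assumes unit-variance Gaussian observations with mean 0 under H0 and mean S under H1, a sum-of-observations test at the fusion center, and per-sensor operating points of 0.05 and 0.95; the function names are hypothetical.

```python
from math import sqrt
from statistics import NormalDist

STANDARD_NORMAL = NormalDist()  # mean 0, standard deviation 1


def q_func(x):
    """Q(x) = 1 - Phi(x), the upper tail of the standard normal."""
    return 1.0 - STANDARD_NORMAL.cdf(x)


def q_inv(prob):
    """Inverse of Q: the point whose upper-tail probability is prob."""
    return STANDARD_NORMAL.inv_cdf(1.0 - prob)


def centralized_np(n, pf_sensor, pd_sensor, pf_center):
    """Return (S, t0, pd_center) for the centralized sum test.

    S         : signal level implied by the per-sensor (P_F, P_D) pair
    t0        : threshold on the sum of the n observations
    pd_center : detection probability of the centralized test
    """
    sensor_threshold = q_inv(pf_sensor)           # P_F = Q(T)
    signal = sensor_threshold - q_inv(pd_sensor)  # P_D = Q(T - S)
    t0 = sqrt(n) * q_inv(pf_center)               # sum is G(0, n) under H0
    pd_center = q_func((t0 - n * signal) / sqrt(n))
    return signal, t0, pd_center


if __name__ == "__main__":
    print(centralized_np(4, 0.05, 0.95, 0.001))
```

With four sensors the sum of the observations has variance 4, so the threshold on the sum is twice the single-observation quantile, matching the factor of 2 in the text.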

This example shows that the best decentralized fusion
scheme achieves a $(P_F^f, P_D^f) = (0.014, 0.9995)$, whereas
the best centralized fusion scheme achieves a $(P_F^c, P_D^c)
= (0.001, 0.9998)$ for the same sensors. Clearly the loss
in power associated with transmitting highly condensed
information from the sensors to the fusion center is
causing the degradation in the performance of the fusion
scheme. As a compromise, multibit information could
be transmitted to the fusion center containing quality
information related to the degree of confidence that a
sensor has about its decision along with the decision
itself. This situation is examined in Section III.

Table I gives the different N-P test thresholds at which the
fusion center can operate so that condition (15) is
satisfied. The thresholds were found using the interactive
fusion algorithm (IFA) that we developed (see the
Appendix).


TABLE I

Decision Fusion: 5 Sensor System
Sensors PF: Equal    Sensors PD: Equal

Threshold            Probability of Detection    Probability of False Alarm
at Fusion Center     at Fusion Center            at Fusion Center
t*                   PD                          PF

PDMAX = .95000       PFMIN = .50000E-01
6859.0               .977407                     .300000E-04
19.000               .998842                     .115812E-02
.52631E-01           .999970                     .225925E-01

1 SENSOR OFF
PDMAX = .95000       PFMIN = .50000E-01
361.00               .985981                     .481250E-03
1.0000               .999519                     .140187E-01

2 SENSORS OFF
PDMAX = .95000       PFMIN = .50000E-01
19.000               .992750                     .725000E-02

B. Dissimilar Sensors

Case 1. All the sensors operate at the same
probability of false alarm level $P_F$, but different levels of
probability of detection from each other, i.e., $P_{D_i} \ne P_{D_j}$,
$i \ne j$. Without loss of generality we assume the
ranking $P_{D_1} > P_{D_2} > \cdots > P_{D_N}$, from which the
following ordering in the abscissae of the conditional
distribution of the individual LRs results:

$$\frac{1 - P_{D_1}}{1 - P_F} < \frac{1 - P_{D_2}}{1 - P_F} < \cdots < \frac{1 - P_{D_N}}{1 - P_F}.$$

The conditional distribution of the compound LR at
the fusion center is obtained by convolving the individual
distributions, using the IFA. Convolution of the
distributions $P(\log \Lambda(u_i) \mid H_k)$ corresponds to linear shifts
of their logarithmic abscissae, which is translated into
addition of logarithms. Hence, the distribution of the LR
$P(\Lambda(u) \mid H_k)$ at the fusion center can be obtained directly
by multiplication of the abscissae of the $P(\Lambda(u_i) \mid H_k)$.
Hence the point of the distribution $P(\Lambda(u) \mid H_k)$ which is
closest to the origin has abscissa

$$\frac{\prod_{i=1}^{N} (1 - P_{D_i})}{(1 - P_F)^N}$$

and ordinate $(1 - P_{D_1}) \cdots (1 - P_{D_N})$ under $H_1$ or
$(1 - P_F)^N$ under $H_0$. On the other hand, the point farthest
apart from the origin has abscissa

$$\frac{\prod_{i=1}^{N} P_{D_i}}{P_F^N}$$

and ordinate $P_{D_1} \cdots P_{D_N}$ under $H_1$ or $P_F^N$ under $H_0$. In
between these two extreme points, the abscissae of the
distribution of the compound LR have the form

$$\prod_{i \in S} \frac{P_{D_i}}{P_F} \prod_{j \in \bar{S}} \frac{1 - P_{D_j}}{1 - P_F}$$

where S is a subset of integers from $\{1, 2, \ldots, N\}$
and $\bar{S}$ its complement with respect to this set. The
corresponding ordinates are $\prod_{i \in S} P_{D_i} \prod_{j \in \bar{S}} (1 - P_{D_j})$ under $H_1$,
or $P_F^{|S|} (1 - P_F)^{|\bar{S}|}$ under $H_0$, where $|\Omega|$ designates the
cardinality of the set $\Omega$. Once the distribution of the
compound LR is determined, the threshold at the fusion
center can be determined to satisfy a given probability of
false alarm $P_F^f$ from which the probability of detection
$P_D^f$ is determined. At the fusion center we want to set up
the threshold so that $P_F^f \le P_F$ while $P_D^f > \max_i \{P_{D_i}\}$.
This is achieved by the IFA as the following example
illustrates.


Consider a five-sensor system. All the sensors operate at the same level
$P_F = 0.05$. However, due to different noise environments or quality of
the sensors, they yield different $P_D$s, as Table II indicates.


TABLE II

Probability Of Detection At The Individual Sensors For The Same
Probability Of False Alarm In A Five-Sensor System

i       1      2      3      4      5
P_D     0.95   0.94   0.93   0.92   0.91

Table III summarizes all the choices of thresholds at the fusion center
that satisfy condition (15), as given by the IFA. A significant
improvement in the system performance is achieved by fusing the
individual decisions.
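
The procedure described for Case 1 can be sketched in a few lines of Python. This is only an illustration of the idea, not the IFA itself: it enumerates all decision vectors instead of convolving, assumes conditionally independent sensors, and the function names `fusion_operating_points` and `admissible_points` are hypothetical. With the Table II values, the first admissible point it lists should be close to the first entry of Table III (detection probability about 0.9578 at a false alarm probability of 3.0E-05).

```python
from itertools import product
from math import isclose, prod


def fusion_operating_points(pf, pd):
    """List (threshold, PD_f, PF_f) for every LR threshold test at the fusion center.

    pf[i] and pd[i] are the false alarm and detection probabilities of sensor i.
    Sensor decisions are assumed independent given each hypothesis.
    """
    patterns = []
    for u in product((0, 1), repeat=len(pf)):
        p1 = prod(d if ui else 1 - d for ui, d in zip(u, pd))  # P(u | H1)
        p0 = prod(f if ui else 1 - f for ui, f in zip(u, pf))  # P(u | H0)
        patterns.append((p1 / p0, p1, p0))
    patterns.sort(reverse=True)  # largest likelihood ratio first

    points, pd_f, pf_f = [], 0.0, 0.0
    for i, (lr, p1, p0) in enumerate(patterns):
        pd_f += p1
        pf_f += p0
        # Emit a point only after the last pattern sharing this LR value.
        is_last = i + 1 == len(patterns)
        if is_last or not isclose(lr, patterns[i + 1][0], rel_tol=1e-9):
            points.append((lr, pd_f, pf_f))
    return points


def admissible_points(points, pf, pd):
    """Keep the points with PF_f <= min(pf) and PD_f > max(pd)."""
    return [p for p in points if p[2] <= min(pf) and p[1] > max(pd)]


pf = [0.05] * 5
pd = [0.95, 0.94, 0.93, 0.92, 0.91]
points = fusion_operating_points(pf, pd)
for threshold, pd_f, pf_f in admissible_points(points, pf, pd)[:3]:
    print(f"t = {threshold:10.3f}  PD = {pd_f:.6f}  PF = {pf_f:.6e}")
```

Exhaustive enumeration costs 2^N evaluations, which is adequate for the handful of sensors considered here; the same function accepts unequal false alarm probabilities, so it also covers the setting of Case 2.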

Case 2. The different sensors operate at different probabilities of
false alarm and probabilities of detection, i.e., $P_{F_i} \neq P_{F_j}$ and
$P_{D_i} \neq P_{D_j}$, $i \neq j$. The distribution of the cumulative LR at
the fusion center is obtained numerically as in Case 1, and the threshold
$t^f$ is found to satisfy a given $P_F^f$. Ideally, the threshold $t^f$
must be chosen so that condition (15) is satisfied. However, this may not
always be feasible. The following examples illustrate the procedure.

We consider three different systems with five, four, and three sensors.
Each system results by eliminating the sensor with the lowest $P_D$ from
the system that has one more sensor. For the five-sensor system, the
$(P_{F_i}, P_{D_i})$ of the sensors are given in Table IV.

Table V summarizes the results as obtained by the IFA.
TABLE III

Decision Fusion, 5 Sensor System
Sensors PF:  Equal X   Unequal _
Sensors PD:  Equal _   Unequal X

Threshold          Probability of Detection   Probability of False Alarm
@ Fusion Center    @ Fusion Center            @ Fusion Center

PDMAX = .95000     PFMIN = .50000E-01

t^f           PD         PF
6163.2        .957817    .300000E-04
53.004        .963797    .142812E-03
45.880        .968973    .255625E-03
40.339        .973523    .368437E-03
38.907        .977913    .481250E-03
34.208        .981772    .594063E-03
32.081        .985391    .706874E-03
29.610        .988731    .819687E-03
28.207        .991913    .932499E-03
24.416        .994668    .104531E-02
20.705        .997003    .115813E-02
.20998        .997454    .330156E-02
.17806        .997835    .544500E-02
.15413        .998165    .758843E-02
.14683        .998480    .973187E-02
.13552        .998771    .118753E-01
.12709        .999043    .140187E-01
.11174        .999282    .161622E-01
.10778        .999513    .183056E-01
.94760E-01    .999717    .204490E-01
.82023E-01    .999892    .225925E-01

TABLE IV

Probability Of False Alarm And Detection For A Five-Sensor System
With Dissimilar Sensors

i       1      2      3      4      5
P_F     0.05   0.04   0.03   0.02   0.01
P_D     0.95   0.94   0.93   0.92   0.91

In all cases, a significant improvement in the performance of the system
is achieved from fusing the decisions.


III. TRANSMISSION OF DECISIONS ALONG WITH
QUALITY INFORMATION

Consider the case where the jth sensor transmits quality information
bits to the fusion center about its decision along with the decision
itself. The simplest case corresponds to the transmission of binary
{0, 1} quality information indicating the degree of confidence that the
sensor has in the decision that it transmits. Under this scenario, a bit
one indicates "confidence", whereas a bit zero indicates "no
confidence". Fig. 5 illustrates how the binary quality bit c is defined.
A strip $(T_l, T_u)$ about the threshold T of an individual sensor is
designated as the region of no confidence and the bit c = 0 is
transmitted along

TABLE V

Decision Fusion, 5 Sensor System
Sensors PF:  Equal _   Unequal X
Sensors PD:  Equal _   Unequal X

Threshold          Probability of Detection   Probability of False Alarm
@ Fusion Center    @ Fusion Center            @ Fusion Center

PDMAX = .95000     PFMIN = .10000E-01

t^f           PD         PF
57882.        .957817    .269200E-05
426.86        .960153    .816400E-05
373.63        .962908    .155360E-04
358.72        .966248    .248480E-04
284.83        .969430    .360200E-04
273.46        .973289    .501320E-04
239.36        .977840    .691439E-04
160.34        .981459    .917159E-04
153.94        .985848    .120228E-03
134.74        .991024    .158640E-03
102.72        .997003    .216852E-03
.99369        .997179    .393780E-03
.75752        .997382    .661908E-03
.66305        .997622    .102314E-02
.63660        .997912    .147942E-02
.42643        .998143    .202115E-02
.37325        .998416    .275098E-02
.35836        .998746    .367287E-02
.28454        .999061    .477889E-02
.27319        .999442    .617598E-02
.23912        .999892    .805816E-02

1 SENSOR OFF

PDMAX = .95000     PFMIN = .20000E-01

t^f           PD         PF
1129.9        .976981    .150400E-03
4.6908        .979548    .697600E-03
4.1058        .982575    .143480E-02
3.9420        .986246    .236600E-02
3.1300        .989742    .348320E-02
3.0051        .993983    .489440E-02
2.6303        .998984    .679560E-02

2 SENSORS OFF

PDMAX = .95000     PFMIN = .30000E-01

t^f           PD         PF
32.222        .989720    .458000E-02

with the decision when the observation r falls into this region. The two
regions forming the complement of the $(T_l, T_u)$ region are considered
confidence regions and the bit c = 1 is transmitted along with the
decision when the observation falls into one of the two regions.

The joint probability distribution of (u, c) (skipping the sensor index
for simplicity) can be easily obtained from

$$P(u, c | H_k) = P(c | u, H_k) \, P(u | H_k), \quad k = 0, 1 \tag{27}$$

where $P(u | H_k)$, $u = \pm 1$ and $k = 0, 1$, is specified by $P_F$ and
$P_D$, and, referring to Fig. 5,


THOMOPOULOS ET AL.: MULTIPLE SENSOR SYSTEM OPTIMAL DECISION FUSION

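$$
\begin{aligned}
P(c = 0 \mid u = 1, H_k) &= \frac{\int_{T}^{T_u} dP(r | H_k)}{\int_{T}^{\infty} dP(r | H_k)} = C^k_{01}, \\
P(c = 1 \mid u = 1, H_k) &= \frac{\int_{T_u}^{\infty} dP(r | H_k)}{\int_{T}^{\infty} dP(r | H_k)} = C^k_{11}, \\
P(c = 1 \mid u = -1, H_k) &= \frac{\int_{-\infty}^{T_l} dP(r | H_k)}{\int_{-\infty}^{T} dP(r | H_k)} = C^k_{10}, \\
P(c = 0 \mid u = -1, H_k) &= \frac{\int_{T_l}^{T} dP(r | H_k)}{\int_{-\infty}^{T} dP(r | H_k)} = C^k_{00},
\end{aligned}
\tag{28}
$$

for k = 0, 1. The integration ranges are the regions of Fig. 5: the
no-confidence strip $(T_l, T_u)$ split at the threshold T, and the two
confidence regions on either side of it.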

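Hence, for every sensor

$$P(u = i, c = j \mid H_k) = C^k_{ji} \, P(u = i \mid H_k), \quad i = -1, 1, \text{ and } j = 0, 1 \tag{29}$$

and

$$\Lambda(u = i, c = j) = \frac{P(u = i, c = j \mid H_1)}{P(u = i, c = j \mid H_0)} = \frac{C^1_{ji} \, P(u = i \mid H_1)}{C^0_{ji} \, P(u = i \mid H_0)}, \quad i = -1, 1, \text{ and } j = 0, 1. \tag{30}$$

Combining (6) and (22) we obtain

$$
\begin{aligned}
P(\Lambda(u, c) \mid H_1) = {} & C^1_{11} P_D \, \delta\!\left(\Lambda(u, c) - \frac{C^1_{11} P_D}{C^0_{11} P_F}\right)
 + C^1_{01} P_D \, \delta\!\left(\Lambda(u, c) - \frac{C^1_{01} P_D}{C^0_{01} P_F}\right) \\
 & + C^1_{10} (1 - P_D) \, \delta\!\left(\Lambda(u, c) - \frac{C^1_{10} (1 - P_D)}{C^0_{10} (1 - P_F)}\right)
 + C^1_{00} (1 - P_D) \, \delta\!\left(\Lambda(u, c) - \frac{C^1_{00} (1 - P_D)}{C^0_{00} (1 - P_F)}\right).
\end{aligned}
\tag{31}
$$

Similarly, $P(\Lambda(u, c) \mid H_0)$ is obtained from (29) by substituting
$P_D$ with $P_F$ in the product-weights of the delta functions. Therefore,
the probability distribution of the LR at the fusion center is given by
the convolution

$$P(\log \Lambda(u, c) \mid H_k) = P(\log \Lambda(u_1, c_1) \mid H_k) * \cdots * P(\log \Lambda(u_N, c_N) \mid H_k). \tag{32}$$

In the case where all the sensors operate at the same level $(P_F, P_D)$
the mathematics simplify somewhat, since

$$
\begin{aligned}
P(\Lambda(u, c) \mid H_1) = {} & \Pr(k \text{ out of } N \text{ decisions favor } H_1, \text{ and } n \text{ out of these } k \text{ decisions have confidence index 1,} \\
 & \text{and } m \text{ out of the } N - k \text{ decisions that favor } H_0 \text{ have confidence index 1} \mid H_1) \\
 = {} & \binom{N}{k} P_D^k (1 - P_D)^{N-k} \binom{k}{n} [C^1_{11}]^n [1 - C^1_{11}]^{k-n} \binom{N-k}{m} [C^1_{10}]^m [1 - C^1_{10}]^{N-k-m}.
\end{aligned}
\tag{33}
$$

Similarly,

$$P(\Lambda(u, c) \mid H_0) = \binom{N}{k} P_F^k (1 - P_F)^{N-k} \binom{k}{n} [C^0_{11}]^n [1 - C^0_{11}]^{k-n} \binom{N-k}{m} [C^0_{10}]^m [1 - C^0_{10}]^{N-k-m} \tag{34}$$

from which

$$P_F^f = \sum_{k = t_1^f}^{N} \; \sum_{n = t_2^f}^{k} \; \sum_{m = t_3^f}^{N-k} \binom{N}{k} \binom{k}{n} \binom{N-k}{m} [C^0_{11}]^n [1 - C^0_{11}]^{k-n} [C^0_{10}]^m [1 - C^0_{10}]^{N-k-m} P_F^k (1 - P_F)^{N-k}. \tag{35}$$

The $P_D^f$ is obtained by an expression similar to (35) with $P_D$ in
place of $P_F$ and the index 1 instead of 0 above C. The thresholds
$t_1^f$, $t_2^f$, and $t_3^f$ are to be determined to satisfy a given
probability of false alarm at the fusion center. Notice that more than
one set of thresholds can yield the same $P_F^f$. Clearly, the set that
results in the highest $P_D^f$ must be selected.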
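
As an illustration of (31) and (32), the following Python sketch builds the distribution of the compound LR for N identical sensors that each send one of four symbols (u, c). It is a simplified stand-in, not the authors' algorithm: it enumerates all symbol vectors and orders them by likelihood ratio rather than using the three count thresholds of (35), the function name `fuse_with_quality_bits` is hypothetical, and the per-symbol probabilities below are made-up illustrative numbers, not values from this paper.

```python
from itertools import product
from math import isclose, prod


def fuse_with_quality_bits(p_h1, p_h0, n):
    """Operating points (threshold, PD_f, PF_f) for n identical sensors.

    p_h1[s] and p_h0[s] give P(u, c | H1) and P(u, c | H0) for each
    symbol s = (u, c); u = 0 stands for the decision u = -1.
    """
    patterns = []
    for symbols in product(p_h1, repeat=n):   # every received (u, c) vector
        a = prod(p_h1[s] for s in symbols)    # P(vector | H1)
        b = prod(p_h0[s] for s in symbols)    # P(vector | H0)
        patterns.append((a / b, a, b))        # LR multiplies, log LR adds
    patterns.sort(reverse=True)

    points, pd_f, pf_f = [], 0.0, 0.0
    for i, (lr, a, b) in enumerate(patterns):
        pd_f += a
        pf_f += b
        # One point per distinct LR value (permutations share the same LR).
        if i + 1 == len(patterns) or not isclose(lr, patterns[i + 1][0], rel_tol=1e-9):
            points.append((lr, pd_f, pf_f))
    return points


# Illustrative (assumed) symbol probabilities; each dictionary sums to one.
p_h1 = {(1, 1): 0.88, (1, 0): 0.07, (0, 0): 0.03, (0, 1): 0.02}
p_h0 = {(1, 1): 0.02, (1, 0): 0.03, (0, 0): 0.05, (0, 1): 0.90}

for threshold, pd_f, pf_f in fuse_with_quality_bits(p_h1, p_h0, n=3)[:5]:
    print(f"t = {threshold:12.4f}  PD = {pd_f:.6f}  PF = {pf_f:.6e}")
```

Because each sensor now contributes four possible LR values instead of two, the fused distribution has more distinct points, which is the source of the finer choice of thresholds discussed in the text.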

From (35) it can be seen that a superior performance in regards to
$(P_F^f, P_D^f)$ can be achieved when quality information is transmitted
along with the decisions. The improvement in performance of the fusion
center when quality information bits are transmitted comes from the fact
that the summation over $P(\Lambda(u, c) \mid H_k)$ can be made finer with
the three different thresholds. To show that, consider the example of
Section IIA. In this example four similar sensors (N = 4) operate at
$P_F = 0.05$ and $P_D = 0.95$ from received data $r_i \sim N(0, 1)$ under
$H_0$ and $r_i \sim N(S = 3.29, 1)$ under $H_1$. The threshold at each
sensor is set to $T = 1.64$ to satisfy $P_F$. Using Fig. 5 and the
previous equations, we obtain for $T_l = 0.8T = 1.312$ and
$T_u = 1.2T = 1.968$ the C's that are given in Table VI.
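
The conditional probabilities of the quality bit for this Gaussian example can be evaluated directly from the Gaussian tail function. The sketch below is an assumption-laden illustration rather than the authors' computation: it takes u = 1 when r is at or above T, writes u = 0 for the decision in favor of H0, and uses hypothetical names (`q`, `quality_coefficients`). Its results may differ in the last digits from tabulated values that were obtained with rounded tail probabilities.

```python
from math import erfc, sqrt


def q(x):
    """Gaussian tail probability Q(x) = P(N(0, 1) > x)."""
    return 0.5 * erfc(x / sqrt(2.0))


def quality_coefficients(mean, t, t_lo, t_hi):
    """Return {(c, u): P(c | u, H)} for r ~ N(mean, 1), with u = 1 iff r >= t."""
    p_u1 = q(t - mean)                      # P(u = 1 | H)
    p_u0 = 1.0 - p_u1                       # P(u = 0 | H)
    conf1 = q(t_hi - mean) / p_u1           # r above the strip, given u = 1
    conf0 = (1.0 - q(t_lo - mean)) / p_u0   # r below the strip, given u = 0
    return {(1, 1): conf1, (0, 1): 1.0 - conf1,
            (1, 0): conf0, (0, 0): 1.0 - conf0}


T = 1.64
C_H0 = quality_coefficients(0.0, T, 0.8 * T, 1.2 * T)    # under H0
C_H1 = quality_coefficients(3.29, T, 0.8 * T, 1.2 * T)   # under H1
for key in sorted(C_H0):
    print(f"C{key}: H1 = {C_H1[key]:.3f}  H0 = {C_H0[key]:.3f}")
```

The dictionary keys follow the index order (c, u) used for the coefficients in (28).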

Using the IFA, it follows that there is a choice of 33 different
thresholds at which the fusion center can operate so that (15) is
satisfied, as shown in Table VII. It can be seen from this table that
there is a significant improvement in the performance of the overall
system
TABLE VI    TABLE VIII

Quality  Bii  Coefficienii  For  Gaussian  Otsmbuud  Oaia  Compatauvc  Rasulis  From  3  Differani  Fusion  S>4i*ini  th  Four 


- 1 

IN -41  Sensors.  All  Operaimg  Ai  Level  iP.  Pqi-'OOS  0'r5iWnen 

H. 

H, 

Ho  1 

The  Individual  Sensors  Transmit 

C. 

1 

c,, 

0-M)  1 

1 

Cji 

0  54 

Only  decisions  I  0  014  v995 

Coo 

0  04?  ' 

C,0 

0  953  1 

Raw  dau  1  Best  crntnlued  N-P  tesii  0  001  !  0  9098 

TABLE VII

Decision Fusion, 4 Sensor System with Quality Bits
Sensors PF:  Equal X   Unequal _
Sensors PD:  Equal X   Unequal _

Threshold          Probability of Detection   Probability of False Alarm
@ Fusion Center    @ Fusion Center            @ Fusion Center

PDMAX = .95000     PFMIN = .50000E-01

t^f

PD

PF

6d3IS. 

956002 

175551E-05 

:0357 

960940 

I99808E-05 

9390 O' 

961918 

:i0220E-05 

2988  7 

963462 

26t876E-05 

2911  9 

980782 

856706E-05 

951  ;t 

981595 

942I3IE-05 

926  74 

990711 

1925806-04 

438  79 

990738 

I93I9IE-04 

302  74 

990880 

197900E-04 

139  65 

990937 

20I943E-04 

136  06 

992362 

306685E.04 

U  446 

992406 

316713E-04 

43  303 

993906 

663I33E4)4 

42  189 

998114 

16604  IE-03 

20  503 

998114 

166033E-03 

14  146 

998129 

l6716tE-03 

13  782 

998524 

1958056-03 

6.5253 

998525 

195924E-03 

6  3575 

998577 

204121E-03 

4  5021 

998579 

204578E-03 

2.0768 

998580 

.204970E-03 

2  0234 

998662 

245637E-03 

1  9713 

999354 

596850E-03 

66097 

999355 

597499E-03 

64397 

999398 

664750E-03 

62741 

999762 

I24555E-02 

29?06 

999763 

124796E-02 

21036 

999763 

124850E-02 

20495 

999771 

128557E-02 

94544E.0I 

999772 

130148E-02 

92113E-Ot 

999810 

17I378E-02 

6695IE-0I 

999810 

17I395E-02 

30090E-OI 

9998II 

175343E-02 

293I6E-OI 

999851 

3I1705E-02 

compared with the individual sensors and the fusion system without
quality information. For a comparable $P_D^f = 0.9998$, the
$P_F^f = 0.0013$ when quality bit information is transmitted, as opposed
to $(P_F^f, P_D^f) = (0.014, 0.9995)$ without quality information. The
performance of the fusion center when one quality information bit is
transmitted approaches that of the best centralized N-P test, as Table
VIII suggests. It is interesting to notice that fusion of the decisions
improves the performance of the overall system even in the case of two
sensors when quality information bits are transmitted along with the
decisions, as Table IX indicates. Table X shows the performance of a
three-sensor system with quality bits.


TABLE IX

Decision Fusion, 2 Sensor System with Quality Bits
Sensors PF:  Equal X   Unequal _
Sensors PD:  Equal X   Unequal _

Threshold          Probability of Detection   Probability of False Alarm
@ Fusion Center    @ Fusion Center            @ Fusion Center

PDMAX = .95000     PFMIN = .50000E-01

t^f

PD

PF

1  0654 

951900 

696499E  02 

1  0380 

995129 

4861  ME  01 

IV. CONCLUSIONS


The problem of fusing decisions from N independent sensors in a fusion
center was considered. We assumed that each sensor transmits its
decision to the fusion center. The decision of each individual sensor is
based on the N-P test. The fusion center formulates the LR using all the
received decisions and decides on which hypothesis is true using the N-P
test also. The pdf of the

TABLE X

Decision Fusion, 3 Sensor System with Quality Bits
Sensors PF:  Equal X   Unequal _
Sensors PD:  Equal X   Unequal _

Threshold          Probability of Detection   Probability of False Alarm
@ Fusion Center    @ Fusion Center            @ Fusion Center

PDMAX = .95000     PFMIN = .50000E-01

t^f

PD

PF

40  645 

985857 

l”dt-E  -d 

13.277 

987683 

W16KUF 

6  1248 

987804 

iq>6<*E  ; 

1  9493 

987994 

;o.'4;;e  ; 

1  8992 

994400 

5ati*<^E  : 

62039 

994500 

yshUiuE  .  : 

60444 

997872 

II14'‘E 

19745 

997890 

II  .’-■•''.'E  ; 

88740E-0I 

998065 

i 

:8243E-01 

998250 
log LR at the fusion center was obtained as the convolution of the pdfs
of the log LRs of the individual sensors. Once the pdf of the LR is
obtained, the threshold at the fusion center is determined by a desired
probability of false alarm.

For a fusion system with three or more sensors, all the sensors
operating at the same $(P_F, P_D)$ level, it was proved that if the N-P
test is used to fuse the decisions, the probability of detection at the
fusion center exceeds that of the individual sensor for the same
probability of false alarm. However, if the sensors operate at arbitrary
$(P_F, P_D)$ levels, no general assessment can be made about the
performance of the fusion center, since the performance depends on how
far the operating points of the sensors are from each other.

The problem of decision fusion when the sensors transmit quality
information bits indicating their confidence in the decisions was also
considered and the N-P test at the fusion center was derived. Several
numerical examples showed that use of quality information can improve
the performance of the fusion center considerably.

An IFA was developed to solve the fusion problem numerically. Once one
of the three parameters (threshold, probability of false alarm, or
probability of detection) is specified, the IFA determines the other
two, given the probabilities of false alarm and detection of each
individual sensor.

APPENDIX

The IFA receives as data the number of sensors, their $(P_F, P_D)$
levels, and the $C^k$ quality information parameters if the sensors
transmit quality information bits along with their decisions. It then
computes the LR pdf at the fusion center conditioned on each hypothesis.
After it computes the pdf, it asks the user which option he or she
prefers. The alternative options are the following.

1) Display of the entire pdf.

2) Threshold computation for a given $P_F^f$ and display of the
corresponding $P_D^f$.

3) Determination of the thresholds that satisfy (15).

4) Threshold computation for a given $P_D^f$ and display of the
corresponding $P_F^f$.

5) Elimination of one or more sensors and repetition of the algorithm.

6) Quit.
Correspondence

The problem of decision fusion in distributed sensor systems is
considered. Distributed sensors pass their decisions about the same
hypotheses to a fusion center that combines them into a final decision.
Assuming that the sensor decisions are independent from each other
conditioned on each hypothesis, we provide a general proof that the
optimal decision scheme that maximizes the probability of detection at
the fusion for fixed false alarm probability consists of a
Neyman-Pearson test (or a randomized N-P test) at the fusion and
likelihood-ratio tests at the sensors.

I.  INTRODUCTION 

Systems of distributed sensors monitoring a common volume and passing
their decisions into a centralized fusion center which further combines
them into a final decision have been receiving a lot of attention in
recent years [1]. Such systems are expected to increase the reliability
of the detection and be fairly immune to noise interference and to
failures. In a number of papers the problem of optimally fusing the
decisions from a number of sensors has been considered. Tenney and
Sandell [2] have considered the Bayesian detection problem with
distributed sensors without considering the design of data fusion
algorithms. Sadjadi [3] has considered the problem of hypothesis testing
in a distributed environment and has provided a solution in terms of a
number of coupled nonlinear equations. The decentralized sequential
detection problem has been investigated in [4, 5]. In [6] it was shown
that the solution of distributed detection problems is nonpolynomial
complete. Chair and Varshney [7] have solved the problem of data fusion
when the a priori probabilities of the tested hypotheses are known and
the likelihood-ratio (L-R) test can be implemented at the receiver.
Thomopoulos, Viswanathan, and Bougoulias [8, 9] have derived the optimal
fusion rule for unknown a priori probabilities in terms of the
Neyman-Pearson (N-P) test.

For the "parallel" sensor topology of Fig. 1, Srinivasan [10] has shown
that the globally optimal solution to the fusion problem that maximizes
the probability of detection for fixed probability of false alarm, when
sensors transmit independent, binary decisions to the fusion center,
consists of L-R tests
Fig. 1. Distributed sensor fusion. Parallel topology.

Fig. 2. Example of singularity of Lagrangian approach used in [10] for
decision fusion. Three identical sensors in slow-fading Rayleigh
channel. Paradigm taken from [11].


at all sensors and an N-P test at the fusion center. This test will be
referred to as N-P/L-R hereafter.

The proof of the optimality of the N-P/L-R test in [10] is based on the
(first-order) Lagrange multipliers method, which does not always yield
the optimal solution, as is shown by example in [11]. For the paradigm
in [11], the Lagrangian approach fails to yield the optimal solution.
Instead, it yields a solution which is by far inferior to the optimal
solution; see Fig. 2. A detailed description and analysis of this
singular case is given in [11, 12]. A theoretical explanation of the
failure of the Lagrange multipliers method can be found in [13, ch. 5]
and [14, 15].

In general, if the optimal solution lies on the boundary of the domain
of x (as in the decision fusion paradigm in [11]), the Lagrangian
formulation fails to guarantee the convexity of the objective function,
and thus the optimality of the solution obtained using the Lagrange
multipliers method. In that sense, the proof of optimality of the
N-P/L-R test for the parallel sensor topology in [10], which is based on
a Lagrangian formulation, is incomplete. We give a complete proof of the
optimality of the N-P/L-R test for the distributed decision fusion
problem that does not depend on the Lagrangian formulation.


II. OPTIMALITY OF N-P/L-R TEST IN DISTRIBUTED
DECISION FUSION

A number of sensors N receive data from a common volume. Sensor k
receives data and generates the first stage decision $u_k$,
$k = 1, 2, \ldots, N$. The decisions are subsequently transmitted to the
fusion center where they are combined into a final decision $u_0$ about
which of the hypotheses is true, Fig. 1. Assuming binary hypothesis
testing for simplicity, we use $u_i = 1$ or 0 to designate that sensor i
favors hypothesis $H_1$ or $H_0$, respectively. In order to derive the
globally optimal fusion rule we assume that the received data at the N
sensors are statistically independent, conditioned on each hypothesis.
This implies that the received decisions at the fusion center are
independent conditioned on each hypothesis. Improvement in the
performance of conventional diversity schemes is based on the validity
of this assumption [16]. Given a desired level of probability of false
alarm at the fusion center, $P_{F_0} = \alpha_0$, the test that maximizes
the probability of detection $P_{D_0}$ (thus, minimizes the probability
of miss $P_{M_0} = 1 - P_{D_0}$) is the N-P test [17, 18]. Because of the
comparison to a threshold this test is referred to as a threshold
optimal test.

Next, we prove that the optimal solution to the fusion problem involves
an N-P test at the fusion center and L-R tests at the sensors.

Let $d(u_1, u_2, \ldots, u_N)$ be the (binary) decision function (rule) at
the fusion. Since $d(u_1, u_2, \ldots, u_N)$ is either 0 or 1, and the
number of all the possible combinations of decisions
$\{u_1, u_2, \ldots, u_N\}$ that the fusion center can receive from the N
sensors is $2^N$, the set of all possible decision functions contains
$2^{2^N}$ functions. However, not all these functions d can be threshold
optimal, as the next lemma states.

Lemma 1. Let the sensor individual decisions $u_k$ be independent from
each other conditioned on each hypothesis. Let
$P_{F_i} = P(u_i = 1 \mid H_0)$ be the false alarm probability and
$P_{D_i} = P(u_i = 1 \mid H_1)$ be the probability of detection at the ith
sensor. Assuming, without loss of generality, that for every sensor
$P_{D_i} > P_{F_i}$, a necessary condition for a fusion function
$d(u_1, u_2, \ldots, u_N)$ to be threshold optimal is

$$d(A_k, U - A_k) = 1 \;\Rightarrow\; d(A_m, U - A_m) = 1 \quad \text{if } A_m > A_k \tag{1}$$

where $U = \{u_1, u_2, \ldots, u_N\}$ denotes the set of the peripheral
sensor decisions, $A_k$ is a set of decisions with k sensors favoring
hypothesis $H_1$ (whereas the complement set of decisions $U - A_k$
favors hypothesis $H_0$), and $A_m$ is any set that contains the
decisions from these k sensors. [The symbol ">" is used to indicate
"greater than" in the standard multidimensional coordinate-wise sense,
i.e., $A_m > A_k$ if and only if $u_{m,i} \geq u_{k,i}$,
$i = 1, 2, \ldots, N$, with at least one holding as a strict inequality,
where $u_{m,i}$ ($u_{k,i}$) indicates the decision of the ith sensor in
the $A_m$ ($A_k$) decision set.]

Proof. Let $P_{F_i} = P(u_i = 1 \mid H_0)$ be the false alarm probability
and $P_{D_i} = P(u_i = 1 \mid H_1)$ be the probability of detection at the
ith sensor. $d(A_k, U - A_k) = 1$ implies that the likelihood ratio

$$\frac{p(A_k, U - A_k \mid H_1)}{p(A_k, U - A_k \mid H_0)} = \frac{p(A_k \mid H_1) \, p(U - A_k \mid H_1)}{p(A_k \mid H_0) \, p(U - A_k \mid H_0)} \geq \lambda_0 \tag{2}$$

which in turn implies that, for $A_m > A_k$,

$$
\begin{aligned}
\frac{p(A_m, U - A_m \mid H_1)}{p(A_m, U - A_m \mid H_0)}
 &= \frac{p(A_k \mid H_1) \, p(A_m - A_k \mid H_1) \, p(U - A_m \mid H_1)}{p(A_k \mid H_0) \, p(A_m - A_k \mid H_0) \, p(U - A_m \mid H_0)} \\
 &\geq \frac{p(A_k \mid H_1) \, p(U - A_k \mid H_1)}{p(A_k \mid H_0) \, p(U - A_k \mid H_0)} \geq \lambda_0
\end{aligned}
\tag{3}
$$

since, under the assumption that $P_{D_i} > P_{F_i}$ for every sensor i,

$$\frac{P(u_i = 1 \mid H_1)}{P(u_i = 1 \mid H_0)} = \frac{P_{D_i}}{P_{F_i}} > \frac{P(u_i = 0 \mid H_1)}{P(u_i = 0 \mid H_0)} = \frac{1 - P_{D_i}}{1 - P_{F_i}}.$$

From (3), it follows that $d(A_m, U - A_m) = 1$.
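
Lemma 1 can be checked numerically on small cases. The sketch below is an illustration under the lemma's assumptions (conditionally independent decisions, detection probability above false alarm probability at every sensor); the function names and the three-sensor operating points are hypothetical choices made for the example. It builds every LR threshold rule and confirms that switching any single sensor decision from 0 to 1 never changes the fused decision from 1 to 0.

```python
from itertools import product
from math import prod


def likelihood_ratio(u, pf, pd):
    """LR of the decision vector u for conditionally independent sensors."""
    num = prod(d if ui else 1 - d for ui, d in zip(u, pd))
    den = prod(f if ui else 1 - f for ui, f in zip(u, pf))
    return num / den


def is_monotone(rule, n):
    """True if raising any single decision from 0 to 1 never lowers rule[u]."""
    for u in product((0, 1), repeat=n):
        for i in range(n):
            if u[i] == 0:
                v = u[:i] + (1,) + u[i + 1:]
                if rule[u] > rule[v]:
                    return False
    return True


def threshold_rules_are_monotone(pf, pd):
    """Check every rule of the form 'decide H1 iff LR(u) >= t'."""
    n = len(pf)
    vectors = list(product((0, 1), repeat=n))
    ratios = {u: likelihood_ratio(u, pf, pd) for u in vectors}
    for t in ratios.values():
        rule = {u: int(ratios[u] >= t) for u in vectors}
        if not is_monotone(rule, n):
            return False
    return True


print(threshold_rules_are_monotone([0.05, 0.04, 0.03], [0.95, 0.94, 0.93]))

# A rule that is not monotone, such as exclusive-or, cannot be threshold optimal.
xor_rule = {u: u[0] ^ u[1] for u in product((0, 1), repeat=2)}
print(is_monotone(xor_rule, 2))
```

Checking single-coordinate changes is sufficient, because any coordinate-wise larger decision set can be reached by a sequence of such changes.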

Remark 1. Functions that do not satisfy (1) cannot lead to the set of
optimal thresholds. A function d that satisfies Lemma 1 is called a
monotone increasing function in the context of switching and automata
theory, Table I, [19].

Remark 2. If $P_{D_i} = P_{F_i}$ for all sensors, the L-R at the fusion
degenerates to one, identically for any combination of the peripheral
decisions [9]. Hence, for any likelihood test, the false alarm
probability $P_{F_0}$ and the detection probability $P_{D_0}$ at the
fusion are either a) both one, if the threshold is less than or equal to
one, or b) both zero, if the threshold is greater than one. In the first
case, the fusion rule always favors hypothesis one, independent of the
combination of sensor decisions, i.e., $d(U) = 1$ for all U's, which is
a monotone increasing function satisfying Lemma 1. In the second case,
the fusion rule always favors hypothesis zero, independent of the
combination of sensor decisions, i.e., $d(U) = 0$ for all U's, which is
a monotone increasing function satisfying Lemma 1.

Remark 3. If $P_{D_i} < P_{F_i}$ for all sensors, the inequality in (3) is
reversed, and Lemma 1 still holds with all threshold optimal decisions
at the fusion being monotonically increasing functions of the sensor
decisions.

Remark 4. If for some sensors $P_{D_i} > P_{F_i}$ while for some others
$P_{D_j} < P_{F_j}$, Lemma 1 does not hold. However, this is an
uninteresting case, for if we wish to maximize the detection probability
at the fusion, we would either ignore the sensors for which
$P_{D_j} < P_{F_j}$, or randomize their decisions by flipping coins and
deciding with probability 1/2 for either one of the two hypotheses.

Lemma 2. For any fixed threshold $\lambda_0$ and any fixed monotonic
function $r(u_1, u_2, \ldots, u_N)$, $P_{D_0}$ is an increasing function of
the $P_{D_i}$s, $i = 1, 2, \ldots, N$.

Proof. The decision function that corresponds to the likelihood test at
the fusion is contained in the set of monotone functions of N variables.
Consider one such monotone increasing decision function
$d(u_1, u_2, \ldots, u_N)$. The function d, when expressed in sum of
product form in the Boolean sense [19], contains only some of the
literals $u_1, \ldots, u_N$ in the uncomplemented form and none of the
complemented variables $(\bar{u}_1, \bar{u}_2, \ldots, \bar{u}_N)$. Since
the random variables $u_1, u_2, \ldots, u_N$ are statistically independent,
it is possible to compute $P_{D_0}$ knowing the $P_{D_i}$s
[9, eq. (20)-(22)]. Taking partial derivatives of $P_{D_0}$ with respect
to the $P_{D_i}$s, one obtains that
$\partial P_{D_0} / \partial P_{D_i} > 0 \;\forall i$, i.e., the desired
result. (As an illustration, consider the function
$d(u_1, u_2, u_3) = u_1 + u_2 u_3$. For this function
$P_{D_0} = P_{D_1} + P_{D_2} P_{D_3} - P_{D_1} (P_{D_2} P_{D_3})$, from
which $\partial P_{D_0} / \partial P_{D_i} > 0$, $i = 1, 2, 3$.)
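
The partial derivatives in the illustration can be written out explicitly. The short derivation below is added as a worked check, assuming $0 < P_{D_i} < 1$ for every sensor and using the expression for $P_{D_0}$ that follows from $d(u_1, u_2, u_3) = u_1 + u_2 u_3$ with independent decisions.

```latex
\begin{align*}
P_{D_0} &= P_{D_1} + P_{D_2} P_{D_3} - P_{D_1} P_{D_2} P_{D_3}, \\
\frac{\partial P_{D_0}}{\partial P_{D_1}} &= 1 - P_{D_2} P_{D_3} > 0, \\
\frac{\partial P_{D_0}}{\partial P_{D_2}} &= P_{D_3} \, (1 - P_{D_1}) > 0, \\
\frac{\partial P_{D_0}}{\partial P_{D_3}} &= P_{D_2} \, (1 - P_{D_1}) > 0.
\end{align*}
```

Each derivative is a product of probabilities strictly between zero and one, so $P_{D_0}$ increases with every $P_{D_i}$.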

Theorem 1. Under the assumption of statistical independence of the
sensor decisions conditioned on each hypothesis, the optimal decision
fusion rule for the parallel sensor topology consists of an N-P test
(or, a randomized N-P test) at the fusion and L-R tests at all sensors.

Proof. Given the decisions $u_1, u_2, \ldots, u_N$ at the fusion center,
the best fusion rule which achieves maximum $P_{D_0}$ for fixed
$P_{F_0} = \alpha_0$ is the N-P test (assuming that the false alarm
probability $\alpha_0$ is realizable by an N-P test at the fusion; the
randomized case is treated separately afterwards). Call the best test at
the fusion center $r(u_1, \ldots, u_N) \gtrless \lambda_0$. From Lemma 1, it
follows that the decision function that corresponds to the above test
must be one of the monotone increasing functions
$d(u_1, u_2, \ldots, u_N)$. Assume that the individual sensors use some
test other than the L-R test and are operating with
$\{(P_{F_i}, P_{D_i}) \;\forall i\}$ such that the condition
$P_{F_0} = \alpha_0$ is met. From [8, 9] it is seen that $P_{F_0}$ is a
function of the $P_{F_i}$s only, and that $P_{D_0}$ is a function of the
$P_{D_i}$s only. Furthermore, from Lemma 2, $P_{D_0}$ is a monotonic
increasing function of the $P_{D_i}$s. Since the L-R test at a sensor
attains the largest probability of detection for a given probability of
false alarm, the L-R tests at the sensors which operate with
$\{(P_{F_i}^* = P_{F_i}, P_{D_i}^*)\}$ lead to the best performance at the
fusion, since in this case the achieved $P_{D_0}^*$ is greater than or
equal to the $P_{D_0}$ that can be achieved with any other test at the
sensors.

If the false alarm probability $\alpha_0$ is not achievable by an N-P
test, a randomized N-P maximizes the
The problem of distributed detection involving N sensors is considered.
The configuration of sensors is serial in the sense that the (j-1)th
sensor passes its decision to the jth sensor and that the jth sensor
decides using the decision it receives and its own observation. When
each sensor employs the Neyman-Pearson test, the probability of
detection is maximized for a given probability of false alarm, at the
Nth stage. With two sensors, the serial scheme has a performance better
than or equal to the parallel fusion scheme analyzed in the literature.
Numerical examples illustrate the global optimization by the selection
of operating thresholds at the

I. INTRODUCTION
The theory of distributed detection is receiving a lot of attention in
the literature [1-10]. Typically, a number of sensors process the data
they receive and decide in favor of one of the hypotheses about the
origin of the data. In a two-class decision problem, the hypotheses
would be signal present ($H_1$) or signal absent ($H_0$). These decisions
are then sent to a fusion center where a final decision regarding the
presence of the signal is made. This scheme, which can be termed
parallel decision making, is shown in Fig. 1. In order to maximize the
probability of detection at the fusion center for a fixed probability of
false alarm, the tests used at the fusion center and at the sensors must
be Neyman-Pearson (N-P) [3, 8]. The above result is based on the
assumption that the data at the sensors conditioned on the hypothesis
are statistically independent. If the conditional independence is
removed, the thresholds of the N-P tests become data dependent and do
not yield any easy solution for optimization [16].

We consider a serial distributed decision scheme
(Fig. 2); in [4] this is called a tandem network. Though
the serial fusion is very sensitive to link failures, its
performance analysis is of interest. In [4], the tandem
network was analyzed with Bayes cost as the optimality
criterion. Though analytical equations are given, no
performance analysis for typical channels or comparison
of performance with respect to the parallel fusion was
provided. Here we aim to fill this gap.

In Section II we derive the relevant equations
describing the operation of the serial scheme based on the
knowledge that all the sensors employ the N-P test. In
Section III we show that the global optimality is
guaranteed when each stage employs the N-P test.
Section IV examines the conditions under which the
performance of the serial scheme is definitely not inferior
to the parallel scheme. Some numerical examples are also
presented to illustrate the performance.

II.  DEVELOPMENT  OF  KEY  EQUATIONS 

Consider the serial configuration of distributed sensors
shown in Fig. 2. Denote the sensor decisions as
$u_1, \ldots, u_N$. The jth sensor receives the decision
$u_{j-1}$ and its own observation $Z_j$ to make its decision
$u_j$. The decision $u_N$ at the Nth sensor is the fused
decision about the hypotheses. We assume that the data at
the sensors, conditioned on each hypothesis, are statistically
independent. This implies that $Z_j$ and $u_{j-1}$ are also
conditionally independent. As mentioned earlier, the jth
sensor employs an N-P test using the data $(Z_j, u_{j-1})$.
The optimality of this assumption is explored in the next
section.

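Denoting the distributions of $Z_j$ as $p(Z_j|H_1)$ and
$p(Z_j|H_0)$, the likelihood ratio becomes

\[
\Lambda(Z_j, u_{j-1})
= \frac{p(Z_j, u_{j-1}|H_1)}{p(Z_j, u_{j-1}|H_0)}
= \frac{p(Z_j|H_1)}{p(Z_j|H_0)} \cdot
\frac{P_{D,j-1}\,\delta(u_{j-1}-1) + (1-P_{D,j-1})\,\delta(u_{j-1})}
     {P_{F,j-1}\,\delta(u_{j-1}-1) + (1-P_{F,j-1})\,\delta(u_{j-1})}
\tag{1}
\]

where

\[
P_{D,j-1} = \Pr(u_{j-1} = 1|H_1), \qquad
P_{F,j-1} = \Pr(u_{j-1} = 1|H_0),
\]

$u_{j-1} = k$ implies that the (j-1)th sensor decides $H_k$,
$k = 0, 1$, $\delta(x)$ is the Kronecker delta function
defined as

\[
\delta(x) = \begin{cases} 1, & x = 0 \\ 0, & x \neq 0 \end{cases}
\]

and $p(\cdot)$ is the likelihood function [14].

Therefore, the test at the jth sensor is given by

\[
\begin{aligned}
\frac{p(Z_j|H_1)}{p(Z_j|H_0)} \cdot \frac{P_{D,j-1}}{P_{F,j-1}}
 &\underset{H_0}{\overset{H_1}{\gtrless}} t, && \text{if } u_{j-1} = 1 \\
\frac{p(Z_j|H_1)}{p(Z_j|H_0)} \cdot \frac{1-P_{D,j-1}}{1-P_{F,j-1}}
 &\underset{H_0}{\overset{H_1}{\gtrless}} t, && \text{if } u_{j-1} = 0
\end{aligned}
\tag{2}
\]

where $t$ is a threshold to be determined.
Equivalently,

\[
\Lambda(Z_j) \underset{H_0}{\overset{H_1}{\gtrless}}
\begin{cases} t_{j,1}, & \text{if } u_{j-1} = 1 \\
              t_{j,0}, & \text{if } u_{j-1} = 0 \end{cases}
\tag{3}
\]

where

\[
\Lambda(Z_j) = \frac{p(Z_j|H_1)}{p(Z_j|H_0)}
\]

and

\[
\frac{t_{j,1}}{t_{j,0}}
= \frac{P_{F,j-1}}{P_{D,j-1}} \cdot \frac{1-P_{D,j-1}}{1-P_{F,j-1}}.
\]

Fig. 2. Serial decision fusion.

Many times it is convenient to use the log likelihood
ratio, $\ln \Lambda(Z_j) = \Lambda^*(Z_j)$. Hence,

\[
\Lambda^*(Z_j) \underset{H_0}{\overset{H_1}{\gtrless}}
\begin{cases} t^*_{j,1}, & \text{if } u_{j-1} = 1 \\
              t^*_{j,0}, & \text{if } u_{j-1} = 0 \end{cases}
\]

and

\[
t^*_{j,1} = t^*_{j,0}
+ \ln\!\left( \frac{P_{F,j-1}\,(1-P_{D,j-1})}{(1-P_{F,j-1})\,P_{D,j-1}} \right),
\qquad j = 2, \ldots, N.
\tag{4}
\]

For the first stage, $t^*_{1,1} = t^*_{1,0}$.

A. False Alarm and Detection Probabilities

At the jth stage, the false alarm probability is given
by

\[
\begin{aligned}
P_{F,j} ={}& \Pr(\Lambda^*(Z_j) > t^*_{j,0} \,|\, H_0, u_{j-1} = 0)\,
             \Pr(u_{j-1} = 0|H_0) \\
         &+ \Pr(\Lambda^*(Z_j) > t^*_{j,1} \,|\, H_0, u_{j-1} = 1)\,
             \Pr(u_{j-1} = 1|H_0).
\end{aligned}
\tag{5}
\]

Let

\[
\begin{aligned}
a_j &= \Pr(\Lambda^*(Z_j) > t^*_{j,0}|H_0) \\
b_j &= \Pr(\Lambda^*(Z_j) > t^*_{j,1}|H_0) \\
c_j &= \Pr(\Lambda^*(Z_j) > t^*_{j,0}|H_1) \\
d_j &= \Pr(\Lambda^*(Z_j) > t^*_{j,1}|H_1).
\end{aligned}
\tag{6}
\]

Using (5), (6), and the conditional independence
assumption, we have

\[
P_{F,j} = a_j (1 - P_{F,j-1}) + b_j P_{F,j-1}. \tag{7}
\]

Similarly,

\[
P_{D,j} = c_j (1 - P_{D,j-1}) + d_j P_{D,j-1}. \tag{8}
\]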

Knowing the distribution of the observations $Z_j$ and using
(4), (6)-(8), it is possible to compute the $P_{D,j}$s
recursively provided the $P_{F,j}$s are specified. If the $P_{F,j}$s
are kept the same, the serial configuration exhibits some
nice properties [5]. However, for a given $P_{F,N}$ at the Nth
stage, this procedure does not guarantee a maximum
$P_{D,N}$. In order to globally optimize the performance, that
is to maximize $P_{D,N}$ for a given $P_{F,N}$, we need a
multidimensional search with respect to the variables
$P_{F,j}$, $j = 1, 2, \ldots, (N - 1)$. The results obtained using
the numerical search procedure are presented in
Section IV.
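
The recursion in (7) and (8) can be illustrated with a short numerical sketch. It is not the authors' program: it assumes a unit-variance Gaussian shift-in-mean model (Z ~ N(0,1) under H0, Z ~ N(s,1) under H1, s > 0), for which the log likelihood ratio is s*z - s^2/2, and the function names `q`, `stage_probs`, and `serial_chain` are hypothetical. The thresholds t*_{j,0} are taken as inputs and t*_{j,1} follows from the threshold relation (4).

```python
from math import log
from statistics import NormalDist

_STD = NormalDist()  # standard normal, used for the Gaussian tail Q(x)


def q(x):
    """Gaussian tail probability Q(x) = 1 - Phi(x)."""
    return 1.0 - _STD.cdf(x)


def stage_probs(t_log, s):
    """Return (Pr(L* > t | H0), Pr(L* > t | H1)) for the ASSUMED model
    Z ~ N(0,1) under H0 and Z ~ N(s,1) under H1, with L*(z) = s*z - s^2/2."""
    z_th = (t_log + 0.5 * s * s) / s  # L*(z) > t  <=>  z > z_th  (s > 0)
    return q(z_th), q(z_th - s)


def serial_chain(t0_log, s):
    """Propagate (P_F, P_D) through the serial chain.

    t0_log[0] is the single log-threshold of stage 1; t0_log[j] (j >= 1)
    is t*_{j,0}, the log-threshold used when the previous decision is 0.
    Returns the list of (P_F, P_D) pairs, one per stage.
    """
    pf, pd = stage_probs(t0_log[0], s)
    history = [(pf, pd)]
    for t0 in t0_log[1:]:
        # Threshold relation (4): log-threshold used when u_{j-1} = 1.
        t1 = t0 + log(pf * (1.0 - pd) / ((1.0 - pf) * pd))
        a, c = stage_probs(t0, s)
        b, d = stage_probs(t1, s)
        # Recursions (7) and (8); the right-hand side uses the old pf, pd.
        pf, pd = a * (1.0 - pf) + b * pf, c * (1.0 - pd) + d * pd
        history.append((pf, pd))
    return history
```

Calling `serial_chain([0.0, 0.0, 0.0], 1.0)` returns the operating point of each of three stages. A global optimization in the sense of the text would wrap such a routine in a search over the intermediate false alarm probabilities.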
VISWANATHAN ET AL.: SERIAL DISTRIBUTED DECISION FUSION


In Fig. 3 a functionally equivalent form of the serial
decision fusion is shown. Each sensor, except the first
one, sends two decisions $u_{j,0}$ and $u_{j,1}$, depending on
whether the previous sensor decides a 0 or a 1,
respectively. These decisions are arrived at by using (3).
The fusion center uses the decision from the first sensor
and sequentially picks the appropriate decisions from the
sensors to arrive at the final decision $u_0$, which is either
$u_{N,0}$ or $u_{N,1}$. Performance-wise, the configuration in
Fig. 3 is equivalent to the serial scheme. The equivalent
configuration does not have the time delay problem
associated with the serial configuration. However, both
are highly sensitive to link failures.


Fig. 3. Functionally equivalent configuration of serial network.


III.  GLOBAL  OPTIMALITY 

The global optimization problem is to find the tests at
each stage of the serial configuration such that the
probability of detection $P_{D,N}$ is maximized for a given
$P_{F,N}$. Here, we show that the global optimality is
achieved when each sensor employs the N-P test.

Theorem 1. Given that the observations at each
stage in a serial distributed detection environment with N
sensors are independent identically distributed (IID), the
probability of detection is maximized for a given
probability of false alarm, at the Nth stage, when each
stage employs the N-P test.


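Proof. Consider the last two stages. At the Nth
stage, the N-P test using the data $(Z_N, u_{N-1})$ maximizes
$P_{D,N}$ for a fixed $P_{F,N}$ [11, 13]. Let

\[
L^* = \ln \frac{p(Z_N, u_{N-1}|H_1)}{p(Z_N, u_{N-1}|H_0)}, \qquad
\Lambda^*(Z_N) = \ln \frac{p(Z_N|H_1)}{p(Z_N|H_0)}.
\tag{9}
\]

Call $\Lambda^*(Z_N)$, $P_{F,N-1}$, and $P_{D,N-1}$ as $\Lambda^*$, $P_F$, and $P_D$,
respectively, for simplicity. Then,

\[
\Pr(L^* < \lambda \,|\, H_1)
= P_D \Pr\!\left(\Lambda^* + \ln\frac{P_D}{P_F} < \lambda \,\Big|\, H_1\right)
+ (1 - P_D) \Pr\!\left(\Lambda^* + \ln\frac{1-P_D}{1-P_F} < \lambda \,\Big|\, H_1\right).
\tag{10}
\]

Denote the cumulative distributions and the density
functions of $\Lambda^*$ under $H_1$ and $H_0$ as $F_1(\cdot)$, $f_1(\cdot)$ and
$F_0(\cdot)$, $f_0(\cdot)$, respectively. Since the left-hand side of
(10) is one minus the probability of detection, we have

\[
1 - P_{D,N} = P_D F_1\!\left(\lambda - \ln\frac{P_D}{P_F}\right)
+ (1 - P_D) F_1\!\left(\lambda - \ln\frac{1-P_D}{1-P_F}\right).
\tag{11}
\]

Similarly,

\[
1 - P_{F,N} = P_F F_0\!\left(\lambda - \ln\frac{P_D}{P_F}\right)
+ (1 - P_F) F_0\!\left(\lambda - \ln\frac{1-P_D}{1-P_F}\right).
\tag{12}
\]

We require for a fixed $P_{F,N}$ and for any arbitrary but
fixed $P_F$ at the (N-1)th stage, the $P_{D,N}$ to be a
monotonic increasing function of the $P_D$ at the (N-1)th
stage. Observe that if the $P_D$ of the (N-1)th stage is
changed, then the threshold $\lambda$ at the Nth stage changes in
order that $P_{F,N}$ remains fixed. Taking the derivative of
(12) w.r.t. $P_D$ and equating the result to zero, we obtain

\[
\frac{d\lambda}{dP_D}
= \frac{\dfrac{P_F}{P_D} f_0(x_1) - \dfrac{1-P_F}{1-P_D} f_0(x_2)}
       {P_F f_0(x_1) + (1-P_F) f_0(x_2)}
\tag{13}
\]

where

\[
x_1 = \lambda - \ln\frac{P_D}{P_F}, \qquad
x_2 = \lambda - \ln\frac{1-P_D}{1-P_F}.
\]

Similarly,

\[
-\frac{dP_{D,N}}{dP_D}
= F_1(x_1) - F_1(x_2)
+ \left[ \{P_D f_1(x_1) + (1-P_D) f_1(x_2)\} \frac{d\lambda}{dP_D}
         - f_1(x_1) + f_1(x_2) \right].
\tag{14}
\]

A reasonable requirement is $P_D > P_F$. This implies that
$F_1(x_1) - F_1(x_2)$ is less than zero. Hence, a sufficient
condition for $dP_{D,N}/dP_D > 0$ is that the term in the brackets
in (14) be less than or equal to zero. After some
simplification, using (13), we obtain the following
sufficiency condition:

\[
\frac{P_D f_1(x_1) + (1-P_D) f_1(x_2)}{P_F f_0(x_1) + (1-P_F) f_0(x_2)}
\left[ \frac{P_F}{P_D} f_0(x_1) - \frac{1-P_F}{1-P_D} f_0(x_2) \right]
\le f_1(x_1) - f_1(x_2).
\tag{15}
\]

However, from the result that the likelihood ratio of the
likelihood ratio is the likelihood ratio itself [11, p. 46],
it follows that (15) is satisfied with equality.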
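
The equality claimed for the sufficiency condition can be checked in a few lines. The sketch below is a worked verification rather than part of the original proof. It assumes $x_1 = \lambda - \ln(P_D/P_F)$ and $x_2 = \lambda - \ln((1-P_D)/(1-P_F))$, writes $f_1$, $f_0$ for the densities of the log likelihood ratio $\Lambda^*$ under $H_1$ and $H_0$, and uses only the property quoted from [11], which for a log likelihood ratio reads $f_1(x) = e^{x} f_0(x)$, together with the expression for $d\lambda/dP_D$ obtained from the fixed false alarm constraint.

```latex
\begin{align*}
f_1(x_1) &= e^{x_1} f_0(x_1) = e^{\lambda}\,\frac{P_F}{P_D}\, f_0(x_1), \\
f_1(x_2) &= e^{x_2} f_0(x_2) = e^{\lambda}\,\frac{1-P_F}{1-P_D}\, f_0(x_2), \\
P_D f_1(x_1) + (1-P_D) f_1(x_2)
  &= e^{\lambda}\,\bigl[ P_F f_0(x_1) + (1-P_F) f_0(x_2) \bigr], \\
\bigl\{ P_D f_1(x_1) + (1-P_D) f_1(x_2) \bigr\} \frac{d\lambda}{dP_D}
  &= e^{\lambda} \left[ \frac{P_F}{P_D} f_0(x_1)
       - \frac{1-P_F}{1-P_D} f_0(x_2) \right]
   = f_1(x_1) - f_1(x_2).
\end{align*}
```

The bracketed term in the derivative of $P_{D,N}$ therefore vanishes, so the sign of $dP_{D,N}/dP_D$ is decided by $F_1(x_2) - F_1(x_1)$ alone, which is positive whenever $P_D > P_F$.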

IV.  PERFORMANCE  ANALYSIS 
A.  Numerical  Results 

By using the algorithm developed in Section II, we
can obtain the best $P_{D,N}$ for a given $P_{F,N}$ by using a
search procedure on the variables $P_{F,j}$, $j = 1, \ldots, (N - 1)$.
We have recursively used the one-dimensional
optimization routine FMIN [15] for this purpose. The
algorithm also requires the zero of a function in order to
obtain the thresholds at each stage (7). The ZEROIN
routine in [15] is used to solve for the zeros. The
convergence to the optimum value is obtained in the case
of 2 sensors and 3 sensors. For performance comparison,
we also considered the following parallel fusion schemes:
two sensors, identical thresholds at the sensors, AND,
OR rules; and three sensors, identical thresholds at the
sensors, AND, OR, majority logic rules. In the three-sensor
case we also consider two other rules, termed F1
and F2. F1 corresponds to the Boolean function
$u_0 = u_1 + u_2 u_3$ and F2 corresponds to $u_0 = u_1 (u_2 + u_3)$. For
F1 and F2, sensors numbered 2 and 3 operate at the
same thresholds. In all the cases the observations at the
sensors are taken to be IID. Two channel models, namely
the constant signal detection in additive white Gaussian
noise (AWGN) and the detection of a slowly fluctuating
Rayleigh target [3, 12], are considered.
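
The parallel fusion rules used for comparison can be evaluated directly from the per-sensor probabilities. The sketch below assumes conditionally independent sensor decisions, so the same function maps the sensors' false alarm probabilities to the fused false alarm probability and the sensors' detection probabilities to the fused detection probability. The function names are hypothetical, and the readings F1: u0 = u1 + u2 u3 and F2: u0 = u1 (u2 + u3) are assumptions about the two extra rules.

```python
from math import prod


def fuse_and(p):
    """Probability that all sensors decide 1."""
    return prod(p)


def fuse_or(p):
    """Probability that at least one sensor decides 1."""
    return 1.0 - prod(1.0 - x for x in p)


def fuse_majority3(p):
    """Probability that at least two of three sensors decide 1."""
    p1, p2, p3 = p
    return p1 * p2 + p1 * p3 + p2 * p3 - 2.0 * p1 * p2 * p3


def fuse_f1(p):
    """Assumed rule F1: u0 = u1 OR (u2 AND u3)."""
    p1, p2, p3 = p
    return 1.0 - (1.0 - p1) * (1.0 - p2 * p3)


def fuse_f2(p):
    """Assumed rule F2: u0 = u1 AND (u2 OR u3)."""
    p1, p2, p3 = p
    return p1 * (1.0 - (1.0 - p2) * (1.0 - p3))
```

For example, applying `fuse_or` to the three sensor detection probabilities and again to the three sensor false alarm probabilities gives one point on the ROC of the OR rule; sweeping the common sensor threshold traces the whole curve.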

Figs. 4-6 show the performance of two sensors in an
AWGN channel and Figs. 7-9 show the performance
with three sensors. The curve named parallel is the best
of the several parallel decision rules mentioned above and
the data fusion corresponds to the centralized detection
scheme which uses data available at all the sensors. With
two sensors, the serial performs better than the parallel,
especially at larger signal-to-noise ratios. With three
sensors, the performances of the two schemes are nearly
the same. Also, either of them is poor compared with the
data fusion. This is due to the loss associated with the
distributed detection. In Rayleigh target detection with
two or three sensors, the OR rule is better than the rest of
the parallel fusion rules. Moreover, the numerical
computation shows that the serial is equivalent to OR for
this channel. Theoretically establishing the equivalence
has not been possible. In the sense that the serial is only
as good as the OR rule, one can term the Rayleigh
channel as conservative (Theorem 2 in the next
subsection implies that the serial should be at least as
good as the OR rule). Figs. 10-13 show the
performances of different schemes for the Rayleigh target
detection. In Figs. 13-15, the performances of F1 and
F2 are equivalent and hence the corresponding graphs
coincide with each other.

B.  Comparison  with  Parallel  Scheme 

An optimal parallel fusion is the parallel scheme of
Fig. 1 which gives the largest possible probability of
detection for a given probability of false alarm at the
fusion. Only a monotone increasing switching function,
called the positive unate function [17], qualifies as a
candidate for the optimal fusion switching function. This
can be easily proved from the requirement that the
optimal scheme employs a likelihood ratio test at the
fusion. One property of a monotone increasing function is
that the function, when expressed as a sum of products, does
not contain any complemented variables. A switching
function which can be expressed as a sequence of two
input and one output functions is a positive unate function
and hence qualifies as a candidate for the optimal parallel
fusion function. An example of one such switching
function of three variables is shown in Fig. 16. Fig. 16
also shows the serial scheme with three sensors.

Theorem  2  (given  below)  establishes  a  sufficient 
condition  for  the  performance  of  the  optimal  serial 
scheme  to  be  not  inferior  to  the  performance  of  the 
optimal  parallel  scheme. 

Theorem 2. If the switching function corresponding
to the optimal parallel fusion can be realised in terms of
a sequence of two variable functions with single output,
then the optimal serial scheme is better than or equal to
the optimal parallel scheme.

Proof. Consider the conservative situation in which
the decision variables $u_1$ in Fig. 16(a) and (b) are identical
and each stage of the serial scheme operates at the
corresponding false alarms of the parallel scheme (in the
Appendix we show that it is possible to achieve such an
operation). The $u_2'$ in Fig. 16(b) is a function of $u_1$ and
the observation $Z_2$. Since the mapping of $(u_1, u_2)$ to $u_2'$ in
the parallel is contained in the mapping of $(u_1, Z_2)$ to $u_2'$
in the serial, the detection power $P^s_{D,2}$ attained at $P_{F,2}$ in
the serial is greater than or equal to $P_{D,2}$. Similarly, $u_0$ in
the parallel is a function of $u_2'$ and $u_3$ only whereas in the
serial it is a function of $u_2'$ and the observation $Z_3$. It is
observed that the $u_2'$ of the serial has the same false alarm
$P_{F,2}$ of the parallel but has a greater than or equal power.
For the serial case, the proof of Theorem 1 shows that
the detection probability of any stage operating at certain
false alarm is a monotone nondecreasing function of the
detection probability of the previous stage operating at
some false alarm. It then follows that $P^s_{D,0}$ is greater than
or equal to $P_{D,0}$. By induction the proof is complete for
any N. Conservatively it is assumed that the false alarm
at each stage of the serial is identical to the one in the
parallel scheme. If the serial scheme false alarms are
optimized then definitely $P^s_{D,0}$ cannot be less than $P_{D,0}$.

From Theorem 2, we observe that for the case of two
sensors, the optimal serial is better than or equal to the
optimal parallel scheme. With three sensors, it is better
than or equal unless the optimal parallel is a majority
decision logic. In such a case, only an actual performance
assessment determines which is better. As mentioned
earlier, in the case of Rayleigh channel with two or three
sensors, the numerical results show that the optimal serial
is just equivalent to OR. In this sense the Rayleigh
channel can be termed conservative. Also, in Figs. 7-9,
over the range of false alarms where the parallel
outperforms the serial, the best of the parallel is the
majority decision rule. In the range where serial is better,
the best of the parallel belongs to the class of Theorem 2.

Fig. 16(a). Example of two input and one output parallel fusion
function with three sensors.


V.  CONCLUSION 

A serial distributed network of N sensors detecting the
presence or absence of a signal is analyzed. When the
sensor observations conditioned on the hypothesis are
statistically independent, the sensors employ the N-P test for
maximizing the detection probability for a given false
alarm probability at the Nth stage (Theorem 1). For
certain noise distributions, the parallel structure requiring
its fusion scheme to belong to a certain class of switching
functions is not superior to the serial scheme
(Theorem 2). As a drawback, any serial network is
vulnerable to link failures. Some numerical examples
illustrate the performance of the optimal serial decision
scheme.

In the case of Rayleigh target detection with two and
three sensors, the performances of the serial and the OR
fusion rule are equal. For AWGN channel and two
sensors, the serial performs better than the parallel.
However, with three sensors the performance is
essentially the same. It is not known whether there exists
any channel, practical or hypothetical, such that the serial
is better than the parallel for a distributed network with
three or more sensors. Considering the complexity of the
serial scheme and the results from this limited study, the
choice seems to favor the parallel fusion for the
distributed detection problem.


APPENDIX 

It is shown here that any false alarm is realizable at
any stage of a serial configuration. Let us denote for
simplicity $P_{F,j-1}$, $P_{F,j}$, $P_{D,j-1}$, $t_{j,1}$, $t_{j,0}$, $a_j$, and $b_j$ by
$\alpha$, $\alpha_0$, $\beta$, $t_1$, $t_0$, $a$, and $b$, respectively. Therefore,
using (2) and (3), and (7)

\[
\alpha_0 = (1 - \alpha)\, a + \alpha\, b, \qquad
t_1 = t_0\, \frac{\alpha\,(1 - \beta)}{(1 - \alpha)\,\beta}.
\tag{A1}
\]

The likelihood ratio $\Lambda$ (from (3)) and hence $a$ and $b$ are
continuous functions of $t_0$. Hence, for a fixed $\alpha$, $\alpha_0$ is a
continuous function of $t_0$. Let the support of the
distribution of $\Lambda$ be between $t_l$ and $t_u$ ($t_l \ge 0$ and
$t_u \le \infty$). As $t_0$ approaches $t_l$, $a$, $b$, and $\alpha_0$ approach 1
and as $t_0$ approaches $t_u$, $a$, $b$, and $\alpha_0$ approach 0.
Therefore, any $\alpha_0$ in (0, 1) can be obtained.
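
The continuity argument suggests a simple way to find the threshold that realizes a prescribed false alarm probability at a stage. The sketch below uses plain bisection in place of the ZEROIN routine mentioned in Section IV, and it assumes a unit-variance Gaussian shift-in-mean model (log likelihood ratio s*z - s^2/2); both choices, and the names `false_alarm_next` and `threshold_for`, are illustrative assumptions. Here `alpha` and `beta` stand for the previous stage's false alarm and detection probabilities.

```python
from math import log
from statistics import NormalDist

_STD = NormalDist()


def q(x):
    """Gaussian tail probability Q(x) = 1 - Phi(x)."""
    return 1.0 - _STD.cdf(x)


def false_alarm_next(t0_log, alpha, beta, s):
    """alpha_0 = (1 - alpha) a + alpha b for the ASSUMED Gaussian model,
    with the two log-thresholds tied together as in (A1)."""
    t1_log = t0_log + log(alpha * (1.0 - beta) / ((1.0 - alpha) * beta))
    a = q((t0_log + 0.5 * s * s) / s)  # Pr(L* > t0 | H0)
    b = q((t1_log + 0.5 * s * s) / s)  # Pr(L* > t1 | H0)
    return (1.0 - alpha) * a + alpha * b


def threshold_for(target, alpha, beta, s, lo=-50.0, hi=50.0, iters=200):
    """Bisection on the log-threshold t0; relies on alpha_0 being a
    continuous, decreasing function of t0 that runs from 1 down to 0."""
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if false_alarm_next(mid, alpha, beta, s) > target:
            lo = mid  # false alarm too large: raise the threshold
        else:
            hi = mid
    return 0.5 * (lo + hi)
```

With `t0 = threshold_for(0.05, 0.1, 0.6, 1.0)`, evaluating `false_alarm_next(t0, 0.1, 0.6, 1.0)` reproduces the requested value to within floating-point accuracy.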

Please note that the method employed here was
suggested by one of the reviewers.
on the raw data available to S1 only, or it may, under certain request
conditions, also take into account the decision $u_2$ of sensor S2. Random
and nonrandom request schemes are analyzed and numerical results are
presented and compared for Gaussian and slow-fading Rayleigh channels.
For each decision-making scheme, an associated optimization
problem is formulated whose solution is shown to satisfy certain a priori
design criteria that we consider essential for sensor fusion.

I. INTRODUCTION

Considerable research has been focused lately on the problem
of distributed decision fusion [1]-[6] where a number of distributed
sensors receive data from a common volume, come up
with a first-stage decision, and then transmit their decisions to a
fusion center which arrives at the final decision by fusing the
sensor decisions (or some form of compact information received
from the sensors). The main assumption in the bulk of the
related literature is that the transmission of information from
sensor to fusion (and possibly the opposite way) is done at no
cost. This implies that exchange of information between the
sensors and the fusion is possible at rates limited only by the
physical bounds of the channel capacity. The main emphasis is
then placed on determining the optimal sensor configuration
(parallel, serial, or combination) [5], [6], and the fusion logic
(AND, OR, etc.) for an array of sensors [5]-[6].

The problem of team decision with risk is common in C³
(command, control, and communications) applications [7], but
not limited to those [13]. Practical application areas for team
decision with risk extend to other fields, such as medical diagnosis,
cryptography, etc., where exchange of information among
decision-makers is not free and communication cost is a factor.
The communication cost can translate into the risk of revealing
one's position in applications, actual bandwidth limitations
for transmission in bps (bits per second), cost in dollars of a
leased communication line in commercial applications, or a
consultation fee for the procurement of an expert opinion by a
consultant.

The problem of distributed detection in the presence of communication
cost has also been considered by Papastavrou and
Athans [7]. In their formulation, they considered symmetric
operation schemes for both the primary and the consulting
sensors, in a way that ignorance could be the end result of an
exchange of information between the sensors even if a price tag
was associated with the information exchange. A general cost
was then attached to each decision under the tested hypotheses,
and the likelihood test was shown to be the optimal decision
rule under the given operating schemes [8].

In this note, we consider the problem of distributed decision
making with two consulting sensors in which every inter-sensor
communication incurs some risk, thus making continuous sensor
communication a very expensive and prohibitive proposition. We
are interested in determining the optimum decision scheme
when the structure of the consultation scheme is specified, given
that a certain amount of risk (or communication cost) can be
tolerated. Given the structure of the consultation scheme, we
seek optimal decision rules that minimize cost functionals that
involve the probability of false alarm, the communication cost,
and the probability of miss. Different possible formulations are
discussed in this note.

II.  Team  Decision  Schemes 

The team-decision scenarios that we analyze in this note
consist of a dual-sensor system and binary hypothesis testing as
in Fig. 1. Due to bandwidth limitations and the sensitivity of the
data, no transmission of raw data between the two sensors is
allowed. The sensors only exchange request signals and decisions.
(Additional quality information bits, such as the degree of
confidence associated with each decision, could have also been
included in the scenarios that are considered without affecting
the structure of the tests significantly.) We present numerical
and some analytical results only for the cases where the primary
sensor transmits request signals to the consultant sensor, whereas
the latter relays only its binary decisions back to the primary,
and no exchange of quality information bits takes place. Random
consultation and nonrandom consultation schemes are considered.

Fig. 1. Dual-sensor configuration in consultation.

In the analysis that follows, we assume that the probability
distributions of the observations for both sensors under either
hypothesis are absolutely continuous with respect to the
Lebesgue measure and that the associated likelihood ratios are
piecewise continuous functions of the thresholds. Furthermore,
we assume that the decisions of the primary and consulting
sensors are mutually independent conditioned on each hypothesis.
Numerical evaluation of the optimal solutions for different formulations
is performed in additive Gaussian noise channels [9]
and slow-fading Rayleigh channels [3], [10]. The following notations
will be used in the sequel.

Notations 

$P_{Di}$ = Detection probability of sensor Si operating alone, i = 1, 2.

$P_{Mi}$ = Miss probability of sensor Si operating alone, i = 1, 2.

$P_{Fi}$ = False alarm probability of sensor Si operating alone, i = 1, 2.

$P_{D12}$ = Detection probability of S1 and S2 in consultation.

$P_{M12}$ = Miss probability of S1 and S2 in consultation.

$P_{F12}$ = False alarm probability of S1 and S2 in consultation.

$P_D$ = Team detection probability.

$P_M$ = Team miss probability.

$P_F$ = Team false alarm probability.

$P_R$ = Request probability (it determines the consultation level).

In the nonrandom consultation case, explicit reference to the
sensor threshold(s) will be required. The notation $P_{Xi}(t') = P'_{Xi}$
and $P_{Xi}(t'') = P''_{Xi}$, X = F, M, or D and i = 1, 2, will be used to
indicate the false alarm (X = F), miss (X = M), or detection
(X = D) probabilities of sensor Si operating at thresholds $t'$ or
$t''$. The notation $P'_{Xi}$ and $P''_{Xi}$ will be used to make the
expressions more compact when needed.

III. Random Consultation Schemes

A.  Random  Consultation  with  Fixed  Probability  and  Reprocessing: 
Problem  Formulation 

The primary sensor S1 consults S2 randomly with a fixed
probability of request $P_R$. When S2 is consulted, it relays its
decision to S1, which in turn reprocesses it with its own raw data
in order to come up with the final decision. The objective is to
minimize the team miss probability $P_M$ (equivalently, maximize
the probability of detection $P_D$) for fixed false alarm probability
$P_F$. The distinguishing feature of this scheme is that the decision
to consult is random and is made independently of the degree of
confidence that sensor S1 may have in its initial decision $u_1$.
The major advantages of the scheme are that: a) it is simple to
analyze; and b) its performance does not depend on the prior
probabilities of the two hypotheses, which may very often be
unknown in C³ and other applications.

The optimal random consultation scheme is equivalent to
switching between the ROC (receiver operating characteristic)
curve of S1 alone [9] and the ROC of the serial combination of
S1 and S2 [6] (Fig. 2) according to a specified request probability
$P_R$, so that the probability of detection is maximized for a fixed
team false alarm probability $\alpha_0$. (For the reader's convenience,
the optimal decision test for serially connected sensors is summarized
in the Appendix.) The team probabilities are easily
obtained as

\[ P_D = P_{D1}(1 - P_R) + P_R P_{D12} \tag{1} \]

\[ P_M = P_{M1}(1 - P_R) + P_R P_{M12} \tag{2} \]

\[ P_F = P_{F1}(1 - P_R) + P_R P_{F12} \tag{3} \]

where 1 in the subscript indicates the sensor S1 operating alone,
and 12 the serial combination of S1 and S2, to be designated as
S12 hereafter. The random consultation decision problem is
mathematically formulated as follows:

\[ \text{Maximize } P_D \quad \text{s.t. } P_F = \alpha_0 \text{ and } 0 \le P_R \le \beta_0. \tag{P1} \]

Using Lagrange multipliers $\omega_1$ and $\omega_2$, the constrained
maximization problem (P1) is converted into the unconstrained
maximization problem

\[
\max J = P_D + \omega_1 (\alpha_0 - P_F)
       + \omega_2 \bigl( (\beta_0 - P_R) P_R - \mu^2 \bigr)
\tag{P1.1}
\]

where $\mu$ is a positive slack variable that is used to convert the
inequality constraints on $P_R$ into an equivalent equality constraint.
The maximization in (P1.1) is understood with respect to
the choice of operating points of S1 and S2, and the level of
consultation $P_R$.
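
The team probabilities under random consultation are a convex mixture of the stand-alone and serial operating points, weighted by the request probability. A minimal sketch follows; the function name `team_probabilities` and its argument names are hypothetical, and the operating points of S1 and S12 are taken as given inputs rather than optimized.

```python
def team_probabilities(pd1, pf1, pd12, pf12, pr):
    """Return the team (P_D, P_M, P_F) for request probability pr.

    pd1, pf1   : detection / false alarm probability of S1 operating alone
    pd12, pf12 : detection / false alarm probability of the serial pair S12
    """
    if not 0.0 <= pr <= 1.0:
        raise ValueError("request probability must lie in [0, 1]")
    pd = pd1 * (1.0 - pr) + pr * pd12  # team detection probability
    pf = pf1 * (1.0 - pr) + pr * pf12  # team false alarm probability
    return pd, 1.0 - pd, pf            # the miss probability is 1 - P_D
```

For instance, `team_probabilities(0.7, 0.1, 0.9, 0.1, 0.25)` mixes a stand-alone point and a serial point that share the same false alarm level, so the team false alarm probability stays at 0.1 while the detection probability moves a quarter of the way from 0.7 to 0.9.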


B. Random Consultation Optimal Solution


Theorem 1: If the ROCs of S1, S2, and the serial combination
of S1 and S2, S12 [6], are strictly concave, then the optimal
solution to problem (P1.1) and thus (P1) involves a
Neyman-Pearson (N-P) test under either stand-alone or serial
modes of operation. The optimal operating points are given as
solutions to the equations

\[ \frac{dP_{D1}}{dP_{F1}} = \frac{dP_{D12}}{dP_{F12}} = \omega_1 \tag{4} \]

\[ \alpha_0 = P_{F1}(1 - P_R) + P_R P_{F12} \tag{5} \]

\[ (P_{D12} - P_{D1}) + \omega_1 (P_{F1} - P_{F12}) + \omega_2 (\beta_0 - 2 P_R) = 0 \tag{6} \]

\[ \omega_2\, \mu = 0. \tag{7} \]

Hence, the optimal solution involves two N-P tests operating
at points of the S1 ROC and the S12 ROC with equal slopes
that satisfy $P_F = \alpha_0$ and $0 \le P_R \le \beta_0$. Condition (7) along with
(6) implies that $P_R = \beta_0$ when $\omega_2 \neq 0$, which is true if
$P_{D12} > P_{D1}$. The solution $\omega_2 = 0$ implies $P_R = 0$, which is the
solution when $P_{D12} = P_{D1}$ and $P_{F12} = P_{F1}$. Furthermore, under
the continuity assumptions, the optimal solution is unique.

Fig. 2. Receiver operating characteristics (ROC) for different levels of
random request.

Proof: Under the assumption that the ROCs of S1, S2, and
the serial combination S12 are strictly concave, the N-P test
maximizes the probability of detection at each one for any fixed
false alarm probability [9]. Thus, for any $P_{F1}$ and $P_{F12}$ that
satisfy the constraint $P_F = \alpha_0$, and for any $P_R$, the detection
probability is maximized if the N-P test is used under both the
stand-alone and serial modes of operation. Substituting $P_D$ and
$P_F$ in (P1.1) from (1) and (3), and differentiating $J$ with respect
to $P_{F1}$ and $P_{F12}$, (4) is obtained. Differentiating $J$ with respect
to $P_R$ and setting the result equal to zero, (6) is
obtained. Differentiation of (P1.1) with respect to $\mu$ results in
(7).

From (7), it follows that $\mu^2 = (\beta_0 - P_R) P_R = 0$ when $\omega_2 \neq 0$.
However, from (6), $\omega_2 \neq 0$ implies that $P_R \neq 0$. Hence, $P_R = \beta_0$
in order to satisfy $\mu = 0$. On the other hand, from (6), $\omega_2 = 0$
if and only if $P_{D12} = P_{D1}$ and $P_{F12} = P_{F1}$, in which case $P_R = 0$,
and thus $\mu = 0$ as well.

The uniqueness of the optimal solution follows from the
absolute continuity assumption and the concavity of the ROC,
from which it follows that $(dP_{D1}/dP_{F1})$ and $(dP_{D12}/dP_{F12})$ are
strictly monotonic functions. Hence, for each $\omega_1$, there exist
unique points on the S1 ROC and on the S12 ROC for which the
equalities in (4) are satisfied.

C. Numerical Results

Numerical results of the optimal solution to problem (P1) in
additive Gaussian noise channels and slow-fading Rayleigh
channels are given in Fig. 3 for different request rates. The
numerical results throughout the note are obtained assuming
the following statistical models for the two channels.

Gaussian:

Observation model at each sensor: $z \sim G(0, 1)$: $H_0$, and
$z \sim G(s, 1)$: $H_1$, where $G(\alpha, \beta)$ designates a Gaussian
distribution with mean $\alpha$ and variance $\beta$. If $t_0$ is the
threshold at the sensor, the operating false alarm and detection
probabilities $(P_F, P_D)$ are given by

False alarm probability: $P_F = Q(t_0)$

Detection probability: $P_D = Q(t_0 - s) = Q[Q^{-1}(P_F) - s]$

where $Q(\cdot) = 1 - \Phi(\cdot)$, $\Phi(\cdot)$ is the cumulative distribution
function (CDF) of the standard normal, $Q^{-1}$ is the inverse of $Q$,
and $s$ = SNR at the sensor in decibels.

Slow-fading Rayleigh:

False alarm probability: $P_F = [\lambda(1 + s)]^{-(1 + 1/s)}$

Detection probability: $P_D = [\lambda(1 + s)]^{-1/s}$

where $\lambda$ is the threshold used, and $s$ the SNR at the sensor in
decibels.

From Fig. 3, it is easy to see that the optimal solution to
problem (P1) is: a) monotonic with respect to the information
fused, independent of the quality of the sensors; b) monotonic
with respect to $\beta_0$; and c) independent of the a priori
uncertainty. These properties are analytically proven in [12].

D.  Random  Consultation  Suboptimal  Solution 

A suboptimal solution to problem (P1) is obtained if $P_{F1}$ and
$P_{F12}$ are constrained to be equal, thus equal to $\alpha_0$ according to
(3). The suboptimal solution to problem (P1) involves N-P tests
for both S1 and S12 as well. The suboptimal operating point is
given as a point between the S1 and S12 ROC curves at level $\alpha_0$
determined by the equality $P_R = \beta_0$ (Fig. 2). The system
$P_D = P_{D1} + \beta_0 (P_{D12} - P_{D1})$ [12]. Numerical results of the suboptimal
solution to (P1) in Gaussian and slow-fading Rayleigh channels
are shown in Fig. 4 for $\beta_0 = 0.25$ and 0.75. For comparison, the
optimal random consultation ROCs for the same values of $\beta_0$
are overlaid in the same figure. The ROCs of the optimal
random consultation scheme are slightly (but visibly) superior to
the ROCs of the suboptimal scheme for the Rayleigh channel
but almost identical (superior only in the third significant digit,
not visible in the plots) to the suboptimal scheme for the
Gaussian channel.

IV. Nonrandom Consultation Schemes Without
Reprocessing

A.  Operating  Scenario 

In the nonrandom consultation schemes we assume that the
decision to consult is made only when the initial decision $u_1$ of
S1 falls within the indecision region (see below for definition);
otherwise, $u_1$ is taken as final. While several different operating
scenarios are possible, we are only concerned with the case in
which S1 may consult S2 but does not relay any quality
information regarding its initial findings. When requested, S2
processes its own raw data, taking into account the fact that it
has been consulted, and transmits its decision $u_2$ to S1, which
then treats it as the final decision. Hence, no reprocessing takes
place at S1 after consultation.¹

We constrain the consultation schemes to the following class.
Let $\Lambda_i(r_i) = P(r_i|H_1)/P(r_i|H_0)$ designate the likelihood ratio
(LR) at the $i$th sensor using data $r_i$, $i = 1, 2$. Assume that S1 has
an uncertainty region $(t_1^l, t_1^u)$. When $\Lambda_1(r_1) > t_1^u$, S1 decides in
favor of $H_1$. When $\Lambda_1(r_1) < t_1^l$, S1 decides in favor of $H_0$. In

¹ A more symmetric scenario than the one used in random consultation
would call for reprocessing of $u_2$ by S1 along with its own raw data
during consultation. However, the performance of the symmetric
scenario would be very close to the nonsymmetric, nonrandom request
scheme considered in this note, as can be seen from Fig. 9, where the
performance of the serial scheme (which corresponds to the optimal
nonrandom request scheme when the consultation rate is 100%) is
compared to the optimal nonsymmetric, nonrandom request scheme at
optimal consultation rate.


Fig. 3. Optimal random consultation detection probability versus SNR
for false alarm probability $\alpha_0 = 0.001$ and different request rates.
Channels: Gaussian (solid) and slow-fading Rayleigh (dashed).


Fig. 4. Comparison of detection probabilities for the suboptimal
nonrandom scheme for request probabilities 0.25 and 0.75. Channels:
Gaussian (solid) and Rayleigh (dashed).

either case, no consultation takes place and the decision of S1 is
final. When $\Lambda_1(r_1) \in (t_1^l, t_1^u)$, S1 consults S2 without
transmitting any quality information about its preliminary decision to S2.
When S2 is consulted, it processes its data using an LRT
conditioned on the event that S1's decision falls in the
indecision region $I$, induced by the fact that it has been consulted, and
relays its decision $u_2$ to S1, which takes it as final for the entire
system. Thus, S1 decides according to the following scheme:

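$\Lambda_1(r_1) \le t_1^l$: choose $H_0$
$t_1^l < \Lambda_1(r_1) < t_1^u$: choose $I$ (Ignorance)
$\Lambda_1(r_1) \ge t_1^u$: choose $H_1$    (8)

while S2 employs the familiar likelihood ratio test given by

$\Lambda_2(r_2, u_1 = I) \underset{H_0}{\overset{H_1}{\gtrless}} t_2$.    (9)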
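The two-stage rule in (8) and (9) can be sketched as follows. This is an illustrative reading, not the authors' code: the function names and the string `"I"` for the ignorance outcome are hypothetical, and ties at a threshold are resolved in favor of a firm decision, which has probability zero under the continuity assumption.

```python
def s1_decision(lr1, t_low, t_up):
    """Three-way rule (8) at S1: returns 0 (H0), 1 (H1), or "I" (consult S2)."""
    if t_low > t_up:
        raise ValueError("the lower threshold must not exceed the upper threshold")
    if lr1 >= t_up:
        return 1
    if lr1 <= t_low:
        return 0
    return "I"


def team_decision(lr1, lr2_given_consult, t_low, t_up, t2):
    """Final decision u0 and a flag telling whether S2 was consulted.

    lr2_given_consult stands for Lambda_2(r2, u1 = I), the likelihood
    ratio that S2 forms once it knows it has been consulted.
    """
    u1 = s1_decision(lr1, t_low, t_up)
    if u1 != "I":
        return u1, False              # S1 is confident; its decision is final
    u2 = 1 if lr2_given_consult > t2 else 0
    return u2, True                   # S2's decision is taken as final, test (9)
```

For example, with thresholds `(0.5, 2.0)` at S1, a likelihood ratio of `1.0` triggers consultation and the outcome is then decided entirely by S2.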

If $u_0$ denotes the final decision of the system, the overall miss
probability is

$P_M = P(u_0 = 0|H_1)$

$= \sum_{u_1} P(u_0 = 0|u_1, H_1) P(u_1|H_1)$

$= P(u_0 = 0|u_1 = 1, H_1) P(u_1 = 1|H_1)$

$+ P(u_0 = 0|u_1 = 0, H_1) P(u_1 = 0|H_1)$

$+ P(u_0 = 0|u_1 = I, H_1) P(u_1 = I|H_1)$.    (10)

The first part of the right-hand side of (10) equals zero, while
the second and third parts can be simplified to give

$P_M = P(u_1 = 0|H_1) + P(u_2 = 0|u_1 = I, H_1) P(u_1 = I|H_1)$.    (11)

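Expressing $P_M$ in terms of the likelihood ratios $\Lambda_1(r_1)$ and
$\Lambda_2(r_2, u_1 = I)$, we get

$P_M = \int_{\Lambda_1 < t_1^l} dP(\Lambda_1(r_1)|H_1) + \int_{\Lambda_2 < t_2} dP(\Lambda_2(r_2, u_1 = I)|H_1) \int_{t_1^l}^{t_1^u} dP(\Lambda_1(r_1)|H_1)$.    (12)

Dropping the arguments from the LR's $\Lambda_1(r_1)$ and $\Lambda_2(r_2, u_1 = I)$
for notational compactness, an expression for $P_D$ is obtained
from (11):

$P_D = \int_{\Lambda_1 > t_1^u} dP(\Lambda_1|H_1) + \int_{\Lambda_2 > t_2} dP(\Lambda_2|H_1) \int_{t_1^l}^{t_1^u} dP(\Lambda_1|H_1)$.    (13)

Similarly, for the overall false alarm probability we obtain

$P_F = \int_{\Lambda_1 > t_1^u} dP(\Lambda_1|H_0) + \int_{\Lambda_2 > t_2} dP(\Lambda_2|H_0) \int_{t_1^l}^{t_1^u} dP(\Lambda_1|H_0)$    (14)

and for the probability of request

$P_R = \int_{t_1^l}^{t_1^u} dP(\Lambda_1) = \int_{t_1^l}^{t_1^u} [dP(\Lambda_1|H_0) P_0 + dP(\Lambda_1|H_1)(1 - P_0)]$.    (15)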


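Note that it is necessary to express the likelihood ratio
$\Lambda_2(r_2, u_1 = I)$ in terms of $\Lambda_2(r_2)$ in order to be able to evaluate
the integrals $\int_{\Lambda_2 > t_2} dP(\Lambda_2|H_0)$ and $\int_{\Lambda_2 > t_2} dP(\Lambda_2|H_1)$. Taking
into account the assumption that $u_1$ and $r_2$ are independent
conditioned on each hypothesis,

$\Lambda_2(r_2, u_1 = I) = \frac{P(r_2|H_1) P(u_1 = I|H_1)}{P(r_2|H_0) P(u_1 = I|H_0)}$    (16)

$= \Lambda_2(r_2) \frac{\int_{t_1^l}^{t_1^u} dP(\Lambda_1|H_1)}{\int_{t_1^l}^{t_1^u} dP(\Lambda_1|H_0)}$.    (17)

Hence the test (9) is equivalent to

$\Lambda_2(r_2) \underset{H_0}{\overset{H_1}{\gtrless}} t_2 \frac{\int_{t_1^l}^{t_1^u} dP(\Lambda_1|H_0)}{\int_{t_1^l}^{t_1^u} dP(\Lambda_1|H_1)} \equiv t_2'$.    (18)

Therefore, it follows that

$\int_{\Lambda_2(r_2, u_1 = I) > t_2} dP(\Lambda_2(r_2, u_1 = I)|H_i) = \int_{\Lambda_2(r_2) > t_2'} dP(\Lambda_2(r_2)|H_i)$, for $i = 0, 1$.    (19)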

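Using (19) and the more convenient notation with the
thresholds $t_1^l$, $t_1^u$ of S1 and $t_2$ of S2 explicitly indicating the
correspondence between the mode of operation and the related
probabilities, (12)-(15) take on the more compact form

$P_M = P_{M1}(t_1^l) + P_{M2}(t_2)[P_{M1}(t_1^u) - P_{M1}(t_1^l)]$    (20)

$P_D = P_{D1}(t_1^u) + P_{D2}(t_2)[P_{D1}(t_1^l) - P_{D1}(t_1^u)]$    (21)

$P_F = P_{F1}(t_1^u) + P_{F2}(t_2)[P_{F1}(t_1^l) - P_{F1}(t_1^u)]$    (22)

$P_R = P_0[P_{F1}(t_1^l) - P_{F1}(t_1^u)] + (1 - P_0)[P_{D1}(t_1^l) - P_{D1}(t_1^u)]$.    (23)

Note that the expressions for $P_D$, $P_F$, and $P_R$ are subject to
the constraint $t_1^u \ge t_1^l$, which in turn implies that

$P_{F1}(t_1^u) \le P_{F1}(t_1^l)$ and $P_{D1}(t_1^u) \le P_{D1}(t_1^l)$.    (24)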
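The compact expressions (20)-(23) translate directly into code. The sketch below is only an illustration under stated assumptions: the function and argument names are hypothetical, the operating points of S1 at its upper and lower thresholds and of S2 in consultation are passed in as probabilities, and `p0` is read as the prior probability of $H_0$.

```python
def team_probabilities(pf1_up, pf1_low, pd1_up, pd1_low, pf2, pd2, p0):
    """Team miss, detection, false alarm, and request probabilities.

    pf1_up, pd1_up   : S1 operating point at its upper threshold
    pf1_low, pd1_low : S1 operating point at its lower threshold
    pf2, pd2         : S2 operating point when it is consulted
    p0               : prior probability of H0 (assumed reading of (23))
    """
    if pf1_up > pf1_low or pd1_up > pd1_low:
        raise ValueError("constraint (24) is violated")
    consult_h1 = pd1_low - pd1_up      # P(u1 = I | H1)
    consult_h0 = pf1_low - pf1_up      # P(u1 = I | H0)
    pm = (1.0 - pd1_low) + (1.0 - pd2) * consult_h1   # (20)
    pd = pd1_up + pd2 * consult_h1                    # (21)
    pf = pf1_up + pf2 * consult_h0                    # (22)
    pr = p0 * consult_h0 + (1.0 - p0) * consult_h1    # (23)
    return {"PM": pm, "PD": pd, "PF": pf, "PR": pr}
```

A quick consistency check is that `PM + PD` equals one for any admissible input, which follows from (20) and (21).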

In the nonrandom consultation framework described above,
the team-decision problem can be formulated as a constrained
or unconstrained optimization problem. A number of different
formulations are meaningful depending on the application and
the objective. Using (20)-(23) and the constraint (24), it is
possible to determine the optimum thresholds $t_1^l$, $t_1^u$, and $t_2$
numerically for a wide range of formulations. In this note,
however, we are only concerned with one nonrandom
consultation formulation. Additional nonrandom consultation
formulations and numerical results are available in [12].

B. Problem Formulation

We formulate the nonrandom consultation decision making
problem as follows.

Maximize:

$P_D$ subject to $P_F = \alpha_0$ and $P_R \le \beta_0$.    (P2)

The inequality constraint in $P_R$ and the N-P test optimal
solution to each subproblem of S1 operating alone or S2 in
consultation with S1 guarantee the existence of the optimal
solution. However, the optimal solution to problem (P2) cannot
be obtained analytically. Using numerical techniques, the
optimal solution to problem (P2) (i.e., the optimal thresholds) can be
obtained via a search algorithm. Using the more compact
notation $P_{X1}^l$ and $P_{X1}^u$ ($X = F, D$) from the earlier defined notations, (22) and
(23) are written, respectively, as

$P_F = P_{F1}^u + P_{F2}[P_{F1}^l - P_{F1}^u] = \alpha_0$    (25)

and

$P_R = P_0[P_{F1}^l - P_{F1}^u] + (1 - P_0)[P_{D1}^l - P_{D1}^u] \le \beta_0$.    (26)

The maximum $P_D$ is found by searching over $P_{F2}$ in the range
(0, 1) and using (25) and (26) to determine $P_{F1}^l$ and $P_{F1}^u$, subject
to the constraint $P_{F1}^l \ge P_{F1}^u$ (since $t_1^u \ge t_1^l$).
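A search of this kind can be sketched for the Gaussian channel as follows. This is not the authors' algorithm: it swaps in a plain two-dimensional grid over $P_{F2}$ and $P_{F1}^u$, solves (25) for $P_{F1}^l$, and discards points that violate (26). The function names, the grid resolution `n`, the default prior `p0 = 0.5`, and the use of the same mean shift `s` at both sensors are all assumptions made for illustration.

```python
from statistics import NormalDist

_STD = NormalDist()  # standard normal distribution


def roc(pf, s):
    """Gaussian sensor ROC: P_D = Q(Q^{-1}(P_F) - s)."""
    if pf <= 0.0:
        return 0.0
    if pf >= 1.0:
        return 1.0
    return 1.0 - _STD.cdf(_STD.inv_cdf(1.0 - pf) - s)


def solve_p2(alpha0, beta0, s, p0=0.5, n=100):
    """Grid search for (P2): maximize P_D with P_F = alpha0 and P_R <= beta0."""
    # Baseline: S1 alone at false alarm alpha0, with no consultation at all.
    best = {"PD": roc(alpha0, s), "PR": 0.0,
            "PF1_up": alpha0, "PF1_low": alpha0, "PF2": None}
    for i in range(1, n):
        pf2 = i / n
        pd2 = roc(pf2, s)
        for j in range(n):
            pf1_up = alpha0 * j / n
            # Equation (25) solved for the lower-threshold false alarm.
            pf1_low = pf1_up + (alpha0 - pf1_up) / pf2
            if pf1_low > 1.0:
                continue
            pd1_up, pd1_low = roc(pf1_up, s), roc(pf1_low, s)
            # Request probability (26).
            pr = p0 * (pf1_low - pf1_up) + (1.0 - p0) * (pd1_low - pd1_up)
            if pr > beta0:
                continue
            # Team detection probability (21).
            pd = pd1_up + pd2 * (pd1_low - pd1_up)
            if pd > best["PD"]:
                best = {"PD": pd, "PR": pr, "PF1_up": pf1_up,
                        "PF1_low": pf1_low, "PF2": pf2}
    return best
```

Because every grid point satisfies (25) by construction, the team false alarm stays at `alpha0`; a finer grid or a root finder on (26) would tighten the result at higher cost.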

Lemma 1: Let $t_{1,\alpha_0}$ be the optimal threshold of S1 for
problem (P2) when $\beta_0 = 0$, i.e., when S1 operates alone at false
alarm probability $\alpha_0$. Then, for every $\beta_0 > 0$

$t_1^l \le t_{1,\alpha_0} \le t_1^u$    (27)

where $t_1^l$ and $t_1^u$ are the thresholds of S1 for the optimal
solution to problem (P2). Furthermore, in order to improve the
performance of the consultation arrangement beyond that of S1
operating alone at the same false alarm probability, the optimal
threshold for S2 must satisfy the inequality

$\int_{\Lambda_2 > t_2} dP(\Lambda_2|H_1) > \frac{\int_{t_{1,\alpha_0}}^{t_1^u} dP(\Lambda_1|H_1)}{\int_{t_1^l}^{t_1^u} dP(\Lambda_1|H_1)}$.    (28)

If there are no $(t_1^l, t_1^u, t_2)$ such that (28) is satisfied as a strict
inequality, $P_R = 0$, $t_1^l = t_1^u = t_{1,\alpha_0}$, and $P_D = P_{D1}(t_{1,\alpha_0})$.

Proof: From $P_F = \alpha_0$ it follows that $t_1^u \ge t_{1,\alpha_0}$. Equating
the false alarm probability of the two-sensor system with that of
S1 operating alone, it follows easily through elementary
algebraic manipulations that

$\int_{\Lambda_2 > t_2} dP(\Lambda_2|H_0) = \frac{\int_{t_{1,\alpha_0}}^{t_1^u} dP(\Lambda_1|H_0)}{\int_{t_1^l}^{t_1^u} dP(\Lambda_1|H_0)} \le 1$    (29)

from which (27) follows. Using (21), the requirement $P_D >
P_{D1}(t_{1,\alpha_0})$ translates to (28) with some elementary algebra. If (28)
cannot be satisfied as a strict inequality, it implies that for every
$(t_1^l, t_1^u)$ the ratio on the RHS of (28) must always be one, since
the LHS of (28) is a cumulative probability distribution which by
assumption is a continuous function of the
threshold $t_2$, thus taking all the values in [0, 1]. This in turn
implies that $t_1^l = t_1^u = t_{1,\alpha_0}$, from which it follows that $P_R = 0$
and, hence, $P_D = P_{D1}(t_{1,\alpha_0})$.
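The "elementary algebraic manipulations" invoked in the proof can be spelled out as below. The sketch assumes the compact forms (21) and (22), writes $t_1^l$ and $t_1^u$ for the lower and upper thresholds of S1, and writes $t_{1,\alpha_0}$ for the threshold of S1 operating alone, so that $P_{F1}(t_{1,\alpha_0}) = \alpha_0$. These symbol choices are a reading of the damaged source, and a nonempty indecision region (positive denominators) is assumed.

```latex
\begin{align*}
% False alarm side: set (22) equal to the single-sensor false alarm alpha_0.
P_{F1}(t_1^u) + P_{F2}\,[P_{F1}(t_1^l) - P_{F1}(t_1^u)] &= P_{F1}(t_{1,\alpha_0}) \\
\Longrightarrow\quad
P_{F2} &= \frac{P_{F1}(t_{1,\alpha_0}) - P_{F1}(t_1^u)}
               {P_{F1}(t_1^l) - P_{F1}(t_1^u)} \le 1 \\
\Longrightarrow\quad
P_{F1}(t_{1,\alpha_0}) &\le P_{F1}(t_1^l)
\quad\Longrightarrow\quad t_1^l \le t_{1,\alpha_0}. \\[4pt]
% Detection side: require (21) to exceed the single-sensor detection probability.
P_{D1}(t_1^u) + P_{D2}\,[P_{D1}(t_1^l) - P_{D1}(t_1^u)] &> P_{D1}(t_{1,\alpha_0}) \\
\Longrightarrow\quad
P_{D2} &> \frac{P_{D1}(t_{1,\alpha_0}) - P_{D1}(t_1^u)}
               {P_{D1}(t_1^l) - P_{D1}(t_1^u)}.
\end{align*}
```

The first chain is (29) and gives the left half of (27), using the fact that $P_{F1}$ is nonincreasing in the threshold; the second chain is (28), since each difference of operating-point probabilities equals the corresponding integral of $dP(\Lambda_1|H_i)$ between the two thresholds.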

C. Numerical Results

The optimization problem (P2) was solved numerically in the
Gaussian and slow-fading Rayleigh channels for different
maximum allowable request rates $\beta_0$. Numerical results from the two
channels for fixed team false alarm probability $\alpha_0 = 10^{-3}$ are
summarized in Figs. 5 and 6. The detection probability curves
for the two channels were obtained by constraining the
maximum allowable request probability at a designated level and
numerically solving the optimization problem (P2). On each
figure, the request probability envelope (bell-shaped curve)
indicates the maximum optimal consultation rate and is achieved by
setting $\beta_0 = 1$. It is interesting to note from Figs. 5 and 6 that
the nonrandom consultation strategy does not always use the
maximum allowable consultation rate for the entire SNR range.
This seems to be counterintuitive, since it can be argued that
more frequent consultation can only improve the team
performance. This might have been true if the decision to consult were
not associated with the degree of confidence of the primary
sensor in its preliminary findings. However, in the nonrandom
strategy scenario that we consider here, this is not the case. In
our scenario, the decision to consult is associated with the
confidence that the primary sensor has in its data. Furthermore,
since the decision of the consulting sensor is taken to be final
once consultation takes place, the initial decision of the primary
sensor only affects the threshold of the secondary sensor (18).
Thus, the maximum consultation rate is not necessarily always
equal to the maximum allowable request rate, for the maximum


Fig. 5. Detection (solid) and optimal nonrandom request (dashed)
probabilities versus SNR for a Gaussian channel. False alarm probability
$\alpha_0 = 0.001$ and prior probability $P_0 = 0.5$.


Fig. 6. Detection (solid) and optimal nonrandom request (dashed)
probabilities versus SNR for the slow-fading Rayleigh channel. False
alarm probability $\alpha_0 = 0.001$ and prior probability $P_0 = 0.5$.


consultation rate is dictated by the degree of confidence of the
primary sensor in its initial findings, which is a function of the
SNR and the channel statistics. From Figs. 5 and 6, it is seen
that the optimal maximum request probability saturates at
different levels for the two channels. This difference in the
behavior of the two channels is explained in [13].

Another observed difference in the behavior of the two
channels is reflected in the variation of the maximum optimal
consultation rate with the prior $P_0$ for fixed SNR and false
alarm probability, Fig. 7. The request probability $P_R$ in (23)
depends on the probability masses associated with the indecision
region under each hypothesis and on the prior $P_0$. If the
inequality constraint $P_R \le \beta_0$ can be satisfied as a strict inequality for
any prior $P_0$, then the indecision probability masses
$[P_{F1}(t_1^l) - P_{F1}(t_1^u)]$ and $[P_{D1}(t_1^l) - P_{D1}(t_1^u)]$ will remain constant irrespective
of $P_0$. This is definitely the case when $\beta_0 = 1$. Hence, for the
maximum optimal consultation the variation of $P_R$ with respect
to $P_0$ is linear (Fig. 7). For the Rayleigh channel, the maximum
optimal request rate is monotonically increasing with $P_0$; it
is monotonically decreasing for the Gaussian channel. For the
Gaussian channel it was found that $P_R$ decreases as $P_0$
increases irrespective of $\alpha_0$ and of SNR. However, for the Rayleigh
channel the variation of $P_R$ as $P_0$ increases depends on $\beta_0$ and
on SNR. The reason is that the slope of the line that determines
$P_R$ in (23), that is $[P_{F1}(t_1^l) - P_{F1}(t_1^u)] - [P_{D1}(t_1^l) - P_{D1}(t_1^u)]$, does
not maintain the same sign for all $\beta_0$'s and SNRs. From Fig. 8, it
is seen that the slope is negative, implying a decreasing
consultation rate, for $P_D \le 0.725$ but positive, implying an increasing
consultation rate, for $P_D > 0.725$.

Fig. 7. Effect of prior probability $P_0$ on detection and optimal request
probabilities. $\alpha_0 = 0.001$, SNR = 10.0 dB. Channel: Gaussian (G) and
Rayleigh (R).

From the analysis of the numerical results, it follows that
despite the exhibited differences between the two channels, the
optimal solutions possess the desired properties postulated by
the design criteria in [12]. Analytical results supporting some of
the above qualitative statements for channels that can be
modeled by absolutely continuous distributions with respect to the
Lebesgue measure under either hypothesis can be found in [12]
for the formulation of problem (P2) and other formulations.

D. Comparison of Numerical Results

In order to compare the advantages of nonrandom
consultation versus random consultation, the minimum necessary
request probability for achieving the same detection probability
with optimal nonrandom consultation as with optimal random
consultation is computed and plotted in Fig. 8 as a function of the
prior probability for the Rayleigh channel, assuming team false
alarm probability 0.001. For random consultation, the request
probability is independent of the prior $P_0$. On the other hand,
the optimal request rate for nonrandom consultation increases
linearly as $P_0$ increases for $P_D > 0.725$, but remains
substantially below the required request rate in random consultation for
the same $P_D$ (compare to Fig. 3). Thus, optimal nonrandom
consultation results in a substantial reduction in the communication
requirements (consultation rate) needed to achieve a certain
team performance level compared to random consultation.
Notice that in Fig. 7, the detection probability for the Rayleigh
channel is below 0.725, and thus the request rate decreases as
the prior probability increases, in agreement with the results in
Fig. 8. If a cost factor (price) is associated with the
communication requirements, the (P2) optimization problem can be
modified to account for that cost [12].


Fig. 8. Optimal request rates required to achieve specific detection
probabilities for the Rayleigh channel at 10.0 dB. Slope of optimal
request rate changes sign depending on the specified detection
probability.

The nonrandom consultation scheme is compared to optimal
and suboptimal random consultation schemes for the request rates equal
to the optimal nonrandom request rate, i.e., the rate that
corresponds to $\beta_0 = 1$ (Fig. 9). The optimal symmetric, nonrandom
consultation scheme for $\beta_0 = 1$, i.e., the serial combination S12,
is also included in the figure. The following are observed: a) the
optimal nonsymmetric, nonrandom consultation scheme
performs very close (identically in the case of the Rayleigh
channel) to the optimal symmetric, nonrandom consultation scheme,
i.e., the serial combination S12; b) the performance of the
optimal and suboptimal random consultation schemes is inferior
to the nonrandom consultation at the optimal request rate; and
c) the suboptimal random consultation scheme performs worse
than the optimal random consultation for the Gaussian channel
but identically to it for the Rayleigh channel.

Conclusions

Random and nonrandom consultation schemes are examined
and different mathematical formulations of the decision making
problem in the presence of consultation cost are analyzed. The
problem of cooperating sensors is cast in a general framework
suggested for sensor integration that satisfies design criteria that
guarantee the benefits of data fusion [11]. The analysis and the
numerical results indicate that the optimal solutions to the
different schemes introduced in this note satisfy the three data
fusion design criteria which we advocate to be essential for the
design of any practical decision making system, namely
monotonicity with respect to fused information, monotonicity with
respect to the cost associated with acquiring the information, and
robustness with respect to a priori uncertainty. Comparison
between the random and nonrandom consultation schemes
demonstrates that nonrandom consultation considerably reduces the
communication requirements for achieving a desired
performance level compared to the communication requirements for
achieving the same performance level with random consultation.
Additional analytical and numerical results from different
formulations of the problem can be found in [12].

Appendix

To derive the ROC of the serial combination of S1 and S2,
we consider a system of two sensors S1 and S2 in which the
decision of sensor S2 is transmitted to sensor S1 and is then
used together with the raw data $r_1$ available to S1 to arrive at a
final decision $u_1$. To that extent, we follow an analysis similar to
(6). Denoting the distribution of $r_1$ as $p(r_1|H_0)$ and $p(r_1|H_1)$,
the likelihood ratio at sensor S1 becomes

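$\frac{p(r_1, u_2|H_1)}{p(r_1, u_2|H_0)} = \frac{p(r_1|H_1)[P_{D2}\,\delta(u_2 - 1) + (1 - P_{D2})\,\delta(u_2)]}{p(r_1|H_0)[P_{F2}\,\delta(u_2 - 1) + (1 - P_{F2})\,\delta(u_2)]}$    (A.1)

where

$P_{D2} = P(u_2 = 1|H_1)$ and $P_{F2} = P(u_2 = 1|H_0)$    (A.2)

are the detection and false alarm probabilities at S2,
respectively, $u_2 = k$ implies that sensor S2 decides $H_k$, $k = 0, 1$, and
$\delta(x)$ is Kronecker's delta:

$\delta(x) = 1$ if $x = 0$, and $\delta(x) = 0$ if $x \ne 0$.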

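Hence, if $t$ is the threshold at sensor S1, the test at S1
reduces to

$\frac{p(r_1|H_1) P_{D2}}{p(r_1|H_0) P_{F2}} \underset{H_0}{\overset{H_1}{\gtrless}} t$    if $u_2 = 1$

$\frac{p(r_1|H_1)(1 - P_{D2})}{p(r_1|H_0)(1 - P_{F2})} \underset{H_0}{\overset{H_1}{\gtrless}} t$    if $u_2 = 0$.    (A.3)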


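Alternatively,

$\Lambda(r_1) \underset{H_0}{\overset{H_1}{\gtrless}} t\,\frac{P_{F2}}{P_{D2}}$    if $u_2 = 1$

$\Lambda(r_1) \underset{H_0}{\overset{H_1}{\gtrless}} t\,\frac{1 - P_{F2}}{1 - P_{D2}}$    if $u_2 = 0$    (A.4)

where

$\Lambda(r_1) = \frac{p(r_1|H_1)}{p(r_1|H_0)}$.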
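The serial test reduces to a likelihood ratio test at S1 whose threshold depends on the decision received from S2. The sketch below illustrates that reduction; the function names are hypothetical, and the two threshold expressions are obtained by moving the S2 probabilities in (A.3) to the right-hand side.

```python
def serial_thresholds(t, pd2, pf2):
    """Thresholds on Lambda(r1) for u2 = 1 and u2 = 0, as in (A.4)."""
    if not (0.0 < pf2 < 1.0 and 0.0 < pd2 < 1.0):
        raise ValueError("pd2 and pf2 must lie strictly between 0 and 1")
    return {
        1: t * pf2 / pd2,                   # S2 reported H1: lower bar at S1
        0: t * (1.0 - pf2) / (1.0 - pd2),   # S2 reported H0: higher bar at S1
    }


def serial_decision(lr1, u2, t, pd2, pf2):
    """Final decision of S1 given its likelihood ratio and S2's decision u2."""
    return 1 if lr1 > serial_thresholds(t, pd2, pf2)[u2] else 0
```

When S2 is informative ($P_{D2} > P_{F2}$), a report of $H_1$ lowers the threshold applied by S1 and a report of $H_0$ raises it.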
masses are allocated to the different alternatives (decisions) by

TABLE I
Number of Monotone Increasing Functions and Percentage Reduction

Number of    Number of Monotone    Number of All Possible    Percentage
Sensors N    Functions             $2^{2^N}$ Functions       Reduction
1            3                     4                         25
2            6                     16                        62.5
3            20                    256                       92.19
4            168                   65,536                    99.74
5            7,581                 4.2949673 x 10^9          99.99982
6            7,828,354             1.8446744 x 10^19         100

TABLE II
Percentage of Total Number of Functions Searched for the Set of Optimal
Thresholds

Number of    $L_N$ (= number of Monotone    Total Number of        Percentage
Sensors N    Functions - 2)                 Functions $R_N$        Reduction
1            1                              1                      0.00
2            4                              2                      50.00
3            18                             9                      50.00
4            166                            114                    31.13
5            7,579                          6,894                  9.03
6            7,828,352                      7,786,338              0.54
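The monotone-function counts in Table I can be checked by brute force for small N. The following is a minimal sketch with hypothetical function names: it enumerates every Boolean function of N binary sensor decisions as a truth table and keeps those that never decrease when a single input switches from 0 to 1. The enumeration grows as $2^{2^N}$, so it is only practical up to N = 4.

```python
from itertools import product


def count_monotone(n):
    """Count monotone nondecreasing Boolean functions of n binary inputs."""
    points = list(product((0, 1), repeat=n))
    index = {p: i for i, p in enumerate(points)}
    # Pairs (a, b) where b is a with one input raised from 0 to 1.
    # Checking these single-bit steps is enough to guarantee monotonicity.
    covers = []
    for p in points:
        for k in range(n):
            if p[k] == 0:
                raised = p[:k] + (1,) + p[k + 1:]
                covers.append((index[p], index[raised]))
    count = 0
    for table in range(1 << len(points)):   # each integer encodes a truth table
        if all(((table >> a) & 1) <= ((table >> b) & 1) for a, b in covers):
            count += 1
    return count


def percentage_reduction(n):
    """Share of all 2^(2^n) functions ruled out by the monotonicity requirement."""
    return 100.0 * (1.0 - count_monotone(n) / 2 ** (2 ** n))
```

Subtracting the two constant functions from each count gives the $L_N$ column of Table II.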

probability of detection at the fusion for the given
false alarm probability. Let the best randomized
N-P test at the fusion center be $t(u_1, \ldots, u_N) \gtrless \lambda_0'$
w.p. $p$, resulting in false alarm probability $P_{F_0}'$, and
$t(u_1, \ldots, u_N) \gtrless \lambda_0''$ w.p. $1 - p$, resulting in false alarm
probability $P_{F_0}''$. The thresholds $\lambda_0'$ and $\lambda_0''$ are chosen so
that the total false alarm at the fusion

$P_{F_0} = p P_{F_0}' + (1 - p) P_{F_0}'' = \alpha_0$.    (5)

Thus, the corresponding detection probability at the
fusion

$P_{D_0} = p P_{D_0}' + (1 - p) P_{D_0}''$.    (6)

Since the probability $p$ is fixed from the constraint
(5), the detection probability in (6) is maximized
if each one of the $P_{D_0}'$ and $P_{D_0}''$ is maximized.

But, according to the part of the proof for the
nonrandomized N-P test above, each one of these two
detection probabilities is maximized if an L-R test is
used at the sensors. Hence, the Theorem is also proven
for the randomized N-P/L-R test.
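The role of the randomization probability in (5) and (6) can be illustrated numerically. The sketch below is an assumption-laden example rather than part of the proof: the function names are hypothetical, and the two nonrandomized operating points are supplied as plain numbers.

```python
def randomization_probability(alpha0, pf_a, pf_b):
    """Solve (5), p * pf_a + (1 - p) * pf_b = alpha0, for p."""
    if pf_a == pf_b:
        raise ValueError("the two tests must have different false alarm rates")
    p = (alpha0 - pf_b) / (pf_a - pf_b)
    if not 0.0 <= p <= 1.0:
        raise ValueError("alpha0 must lie between pf_a and pf_b")
    return p


def randomized_detection(alpha0, pf_a, pd_a, pf_b, pd_b):
    """Detection probability (6) of the randomized test at false alarm alpha0."""
    p = randomization_probability(alpha0, pf_a, pf_b)
    return p * pd_a + (1.0 - p) * pd_b
```

Since `p` is fixed by the false alarm constraint alone, raising either `pd_a` or `pd_b` can only raise the mixture, which is the step the argument relies on.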

A precise characterization of the set of fusion
functions that satisfy Theorem 1, indicated as $R_N$ in
Table II, can be found in [12].

III. CONCLUSIONS

A general proof that the optimal fusion rule for
the distributed detection problem of Fig. 1 involves
an N-P test (or a randomized N-P test) at the fusion
and L-R tests at all sensors has been provided. The
proof does not suffer from the weaknesses of the
Lagrange-multipliers-based proof in [10].

D. K. BOUGOULIAS

partition of the LR by the decision rule is shown in Fig. 4. It
corresponds to a standard binary hypothesis - binary decision Bayesian
problem.

Fig. 4. Case 1.2: The indecision region is completely eliminated.

Proof: We prove the theorem for the case of two sensors and
binary hypotheses testing. A generalization of the proof, although
notationally involved, does not present any conceptual difficulties and
as such is omitted.

conditioned on each hypothesis, $k = 1, 2, \ldots, N$. Given a desired level of
probability of false alarm at the fusion center, $P_{F_0} = \alpha_0$, we seek the
optimal test that maximizes the probability of detection $P_{D_0}$ (or minimizes
the probability of miss $P_{M_0} = 1 - P_{D_0}$). In our case, the $P_{F_0}$ and $P_{D_0}$ are
given by

Stelios C. A.

Decision and Control Systems Laboratory
Department of Electrical and Computer Engineering
The Pennsylvania State University
University Park, PA 16803

Abstract

Distributed Decision (Evidence) Fusion (DD(E)F) exhibits some interesting characteristics which are not present
in centralized, or raw data, fusion. The interesting characteristics relate to the semantic information that the
decisions (in the broader sense of the term) convey, which (semantic information) is not present, at least explicitly, when
raw data is fused. Different theories and results related to DD(E)F have appeared in the literature. Each theory takes a
different stand on the definition of how to measure evidence or combine decisions. The objective of this paper is to
investigate the nature of DD(E)F and establish a comparative basis between the two most prominent theories in DD(E)F,
namely the Bayesian and Dempster-Shafer theories. To that extent, the similarities and differences between the two
theories that result from the semantic differences in the format of the fused information are investigated. A performance
comparison between the two theories is attempted. A Generalized Evidence Processing (GEP) theory that extends the
Bayesian approach into fuzzy decision making is used to compare the performance of a Bayesian soft decision making system
with that of a hard decision making Bayesian system. The similarities and differences between the GEP combining rule and
Dempster's combining rule are discussed and a consistency comparison between the two rules is performed.

1. Distributed Decision Fusion and Evidence Processing

Distributed Decision (Evidence) Fusion (or DD(E)F in the sequel) exhibits some interesting characteristics which
are not present in centralized, or raw data, fusion. The interesting characteristics relate to the semantic information
that the decisions (in the broader sense of the term) convey, which is not present, at least explicitly, when raw data is
fused. Different theories and results related to Distributed Decision Fusion (DDF) have appeared in the literature in the
last decade [TeSa '81, Sadj '86, ChVa '86, Srin '86, TVB '87, VTT '88, TVB '88, Demp '68, Shaf '76, Thom '90]. Each theory
takes a different stand on the definition of how to measure evidence or combine decisions. The objective of this paper is
to investigate the nature of DD(E)F, present some of the dominating theories on DDF and DEF, highlight similarities and
differences among them that result from the semantic format of the fused information, and exploit natural topological
equivalences between DDF and structures that exhibit learning abilities, such as neural networks.

To avoid concealing some of the issues under structural complexities and keep the discussion focused and as
clear as possible, we consider the simplest, yet fundamental, DDF topology and problem. We assume a parallel topology in
which each sensor receives data from a common volume, Fig. 1. Furthermore, we assume that the sensors are perfectly
aligned, so the problem of mismatch does not arise [ThOk '88]. In this parallel topology we assume the simplest DDF
problem with each sensor's data statistically independent from the other sensors. Each sensor performs a local operation
on its data and transmits the outcome to the fusion. The fusion collects all the local information from the sensors and
produces the global inference. Several optimality results on Bayesian DDF have been obtained in recent years [TVB '89],
[ChVa '86], [TVB '87], [VAT '89], [Thom '90], [TSI '90]. Under the assumptions stated above, the optimal Bayesian DDF is
shown in Fig. 2. In this paper we consider multi-level logic decision rules, in which the number of permissible local
decisions exceeds the number of tested hypotheses. Decision rules for binary, as well as multiple hypotheses, testing
problems are considered.

In DDF, the outcome of the global processing (fusion) depends on the outcome of the local data processing
(sensor level) and the semantic format of the fused information. In the Bayesian context, the outcome of the local
processing can be either hard decisions in a single-level logic [Thom '90], or soft decisions in a multi-level logic [Thom
'90], or it can be the outcome of a simple quantization of the data, if no semantic attributes are attached to the outcome
of the local processing [LLC '90]. In the context of the Dempster-Shafer (D-S) theory, the outcome of the local
processing is a set of probabilities that relate to the degree of support for each proposition in the frame of discernment
by the data of each local processor [Demp '68, Shaf '76]. Thus, the local processing outcome of a Bayesian DDF is a
quantized scalar number, whereas the outcome of the D-S local processor is a real-valued vector that corresponds to an
entire probability distribution.

In addition to semantic differences in the output of the local processors, there are also substantial
differences in the communication requirements for transmitting the local information to the fusion. Even in the presence
of multi-level logic, the communication requirements for transmitting one out of, say, M integers are substantially lower
than for transmitting an M-dimensional real-valued vector. Hence, the communication requirements for the Bayesian DDF are
substantially lower than the requirements of D-S DEF for the same number of data. Thus, a meaningful comparison between
Bayesian and D-S DDF should either fix the available communication bandwidth to be the same for both approaches, or fix
the fusion objectives to be common and study the communication overhead. In this paper we attempt a comparison of the
D-S DDF with the Bayesian DDF assuming identical communication requirements.
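The gap in communication load can be made concrete with a small calculation. The sketch below is purely illustrative: the function names are hypothetical, and the figure of 32 bits per real-valued component is an assumption about the number format, not something stated in the text.

```python
def bits_for_decision(m):
    """Bits needed to send one out of m possible integer decisions."""
    if m < 1:
        raise ValueError("m must be at least 1")
    return (m - 1).bit_length()   # equals ceil(log2(m)) for m >= 1


def bits_for_vector(m, bits_per_component=32):
    """Bits needed to send an m-dimensional real-valued vector.

    bits_per_component is an assumed word length for each component.
    """
    return m * bits_per_component
```

For three decision levels, the local decision needs 2 bits, while a three-component vector at the assumed word length needs 96 bits.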

Several optimality results on Bayesian DDF have been obtained in recent years [TVB '89], [ChVa '86], [TVB '87],
[VAT '89], [Thom '90], [TSI '90]. In [Thom '88 and Thom '90] a Generalized Evidence Processing (GEP) theory was
introduced. The theory generalizes the Bayesian DDF into a framework where soft decision making is allowed. The GEP
theory is briefly summarized in the next section. For a complete description of the GEP theory, see [Thom '90 and Thom
'90].

2. Generalized Evidence Processing Theory

The pivoting idea behind GEP theory is the separation of hypotheses from decisions. Once this separation is understood, the Bayesian (or N-P) DDF theory can be extended to a frame of discernment similar to that of D-S theory. In the context of GEP theory, the choice of different decisions can be thought of as different quantization levels of the data. For notational simplicity, the GEP theory is first presented for binary hypothesis decision fusion. Generalization to multiple hypotheses decision fusion follows at the end of the section. Let $H_0$, $H_1$ be the two hypotheses under test. The probability space is partitioned into two regions according to the events $\{a = H_1\}$ and $\{a = H_0\}$ with associated probabilities $P_1 > 0$ and $P_0 > 0$ respectively, where $P_0 + P_1 = 1$. Let $d_0$, $d_1$, and $d_2$ be a frame of discernment used by a decision maker to partition the probability space according to the gathered evidence, where the three decisions correspond to the propositions "$H_0$ true," "$H_1$ true," and "$H_0$ or $H_1$ true," respectively. The decision $d_2 := H_0 \lor H_1$, where "$\lor$" stands for "or," indicates the inability of the decision maker to come up with conclusive evidence on the true nature of the hypothesis.

In the classical probabilistic (Bayesian) framework, the probability associated with $d_2$ is equal to

$$\Pr(H_0 \lor H_1) = \Pr(H_0 \cup H_1) = \Pr(H_0) + \Pr(H_1) = 1 \qquad (2.1)$$

since $H_0$ and $H_1$ constitute a disjoint coverage of the probability space over which the evidence processing problem is defined. As was mentioned earlier, the apparent weakness of the Bayesian theory in incorporating non-mutually exclusive, i.e. redundant, propositions gave rise to the D-S theory, which is particularly efficient in dealing with fuzzy propositions. However, by disassociating decisions from hypotheses, a unified framework is created which can accommodate both Bayesian and D-S DDFs.

In the context of GEP theory, the basic probability assignment (bpa) is accomplished either by minimizing a generalized Bayesian risk [Thom '89], or through any method that is applicable to D-S theory [Thom '90]. If the objective at the fusion is to minimize a generalized Bayesian risk, evidence combining in the GEP theory is done using likelihood ratio functions and pairwise multiplication of probabilities according to the way described in Table I and Eq. (2.2). The GEP combining rule involves pairwise multiplication of probability masses according to Table I, as in D-S theory. However, in GEP theory the masses are associated via thresholds in an optimal way so that a certain risk is minimized, or so that the probability of detection is maximized for fixed false alarm and indecision probabilities (generalized Neyman-Pearson test), whereas in D-S theory the probability masses (beliefs) are combined according to intersection of events, resulting in evidence conflict (Eq. 3.6). For a numerical study of the effect of the decision cost on the selection of the bpa and the performance of the GEP DDF rule see [Galu '90].

Table I  GEP Evidence Combining Rule (2 hypotheses, 3 decisions)

S1 \ S2       m^2(d_0)              m^2(d_1)              m^2(d_2)

m^1(d_0)      m^1(d_0) m^2(d_0)     m^1(d_0) m^2(d_1)     m^1(d_0) m^2(d_2)

m^1(d_1)      m^1(d_1) m^2(d_0)     m^1(d_1) m^2(d_1)     m^1(d_1) m^2(d_2)

m^1(d_2)      m^1(d_2) m^2(d_0)     m^1(d_2) m^2(d_1)     m^1(d_2) m^2(d_2)

The probabilities in Table I are conditioned on each hypothesis $H_i$, $i = 0, 1$. Thus, each $m_i^j$, $j = 1, 2$, in Table I is a conditional probability for $i = 0, 1$. Hence, the initial probability combining takes place among conditional probabilities only. For $i = 0, 1$, each product term in Table I is a probability mass on the LRT coordinate axis, with abscissa the likelihood ratio $\Lambda(d)$ for every $d$. Evidence combining under each hypothesis is done from Table I by summing the probabilities from Table I whose abscissae fall in specific intervals specified either by an optimization criterion or by a certain desired performance. Hence, for $d = d_0, d_1, d_2$, evidence combining under each hypothesis $H_i$, $i = 0, 1$, is done according to the threshold rule

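$$m_i^1(d_k)\, m_i^2(d_m) \;\rightarrow\; \text{decision } d_j \quad \text{if} \quad \frac{m_1^1(d_k)\, m_1^2(d_m)}{m_0^1(d_k)\, m_0^2(d_m)} \in R_j \qquad (2.2)$$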


where $R_j$ is the decision region that favors decision $d_j$. The regions $R_j$ may be determined so that a performance criterion is optimized at the fusion (and possibly at the sensors). For a single binary hypothesis, the decision regions at the fusion are determined by simple thresholds, in which case the decision rule (2.2) simplifies to


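$$m_i^1(d_k)\, m_i^2(d_m) \;\rightarrow\; d_j \quad \text{if} \quad t_{j-1} < \frac{m_1^1(d_k)\, m_1^2(d_m)}{m_0^1(d_k)\, m_0^2(d_m)} \le t_j \qquad (2.3)$$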


for all $k$, $m$, and $j$, where $t_j$ are the thresholds of the LRTs associated with the different decisions that minimize some risk function. If multiple hypotheses (more than two) are tested, the combining rule is extended to combine the belief functions of the individual sources at the fusion and generate the new conditional belief function under each hypothesis. The association of the new belief function at the fusion with the set of admissible decisions must be done by using the multiple-hypotheses LRT [VTre '68], or another test that optimizes some performance measure. It must again be underlined that the probabilities in the GEP combining rule need not be defined through Bayesian reasoning but may very well correspond to belief functions resulting from the D-S approach.
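
The combining procedure of Table I and the threshold rule can be sketched as follows for two sensors and binary hypotheses. This is only an illustrative sketch: the function name `gep_combine`, the dictionary layout, the example masses and the two thresholds are all assumptions, and the fused regions are ordered by increasing likelihood ratio (favoring $H_0$, undecided, favoring $H_1$).

```python
from itertools import product


def gep_combine(m1, m2, thresholds):
    """Combine two sensors' conditional masses as in Table I.

    m1, m2: dict mapping a local decision label to a pair
            (mass under H0, mass under H1).
    thresholds: ascending likelihood-ratio thresholds that split the
            LR axis into len(thresholds) + 1 fused decision regions.
    Returns a list of [mass under H0, mass under H1], one per region.
    """
    fused = [[0.0, 0.0] for _ in range(len(thresholds) + 1)]
    for (_, (a0, a1)), (_, (b0, b1)) in product(m1.items(), m2.items()):
        p0, p1 = a0 * b0, a1 * b1            # product terms of Table I
        lr = p1 / p0 if p0 > 0 else float("inf")
        region = sum(lr > t for t in thresholds)  # thresholds exceeded
        fused[region][0] += p0
        fused[region][1] += p1
    return fused


# Hypothetical local masses: (P(d | H0), P(d | H1)) for d0, d1, d2.
sensor = {"d0": (0.7, 0.1), "d1": (0.1, 0.7), "d2": (0.2, 0.2)}
regions = gep_combine(sensor, sensor, thresholds=[0.5, 2.0])
for name, (p_h0, p_h1) in zip(("favor H0", "undecided", "favor H1"), regions):
    print(name, round(p_h0, 4), round(p_h1, 4))
```

Each product term keeps its conditional mass under both hypotheses, and only the likelihood ratio decides which fused region it is summed into, so the fused masses under each hypothesis still add up to one.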

In the multiple hypotheses case, the conditional belief function in GEP becomes a multi-variable function of the LRTs

$$\left\{ \Lambda_k(d) := \prod_{j=1}^{J} \frac{dP(d_j \mid H_k)}{dP(d_j \mid H_0)}, \quad k = 1, 2, \ldots, m-1 \right\}$$

where $J$ is the number of sensors in the fusion system, $d_j$ the decision of the $j$-th sensor, and $m$ the number of tested hypotheses. The evidence from the different sensors is combined by forming the joint probability distribution of the LRTs under each hypothesis, i.e. by generating $dP(\Lambda_1, \Lambda_2, \ldots, \Lambda_{m-1} \mid H_k)$ for every hypothesis $H_k$. For two sensors with independent decisions conditioned on each hypothesis, the conditional evidence combining rule of GEP for three hypotheses and soft decisions (fuzzy logic) can be implemented using Table II.

Table II  Combining rule for multiple-hypotheses GEP theory

$$dP(\Lambda_1(d_1,d_2), \Lambda_2(d_1,d_2) \mid H_k) = dP(\Lambda_1(d_1,d_2) \mid H_k)\, dP(\Lambda_2(d_1,d_2) \mid H_k)$$

(d_1, d_2)     Λ_1(d_1, d_2)    Λ_2(d_1, d_2)    dP(Λ_1(d_1,d_2), Λ_2(d_1,d_2) | H_k)

(0, 0)         Λ_1(0, 0)        Λ_2(0, 0)        dP(Λ_1(0,0) | H_k) dP(Λ_2(0,0) | H_k)
(0, 1)         Λ_1(0, 1)        Λ_2(0, 1)        dP(Λ_1(0,1) | H_k) dP(Λ_2(0,1) | H_k)
(0, 2)         Λ_1(0, 2)        Λ_2(0, 2)        dP(Λ_1(0,2) | H_k) dP(Λ_2(0,2) | H_k)
(0, 0∨1)       Λ_1(0, 0∨1)      Λ_2(0, 0∨1)      dP(Λ_1(0,0∨1) | H_k) dP(Λ_2(0,0∨1) | H_k)
(0, 0∨2)       Λ_1(0, 0∨2)      Λ_2(0, 0∨2)      dP(Λ_1(0,0∨2) | H_k) dP(Λ_2(0,0∨2) | H_k)
(0∨1, 0)       Λ_1(0∨1, 0)      Λ_2(0∨1, 0)      dP(Λ_1(0∨1,0) | H_k) dP(Λ_2(0∨1,0) | H_k)
(0∨1, 1)       Λ_1(0∨1, 1)      Λ_2(0∨1, 1)      dP(Λ_1(0∨1,1) | H_k) dP(Λ_2(0∨1,1) | H_k)
(0∨1, 2)       Λ_1(0∨1, 2)      Λ_2(0∨1, 2)      dP(Λ_1(0∨1,2) | H_k) dP(Λ_2(0∨1,2) | H_k)
(0∨1, 0∨1)     Λ_1(0∨1, 0∨1)    Λ_2(0∨1, 0∨1)    dP(Λ_1(0∨1,0∨1) | H_k) dP(Λ_2(0∨1,0∨1) | H_k)
(0∨1, 0∨2)     Λ_1(0∨1, 0∨2)    Λ_2(0∨1, 0∨2)    dP(Λ_1(0∨1,0∨2) | H_k) dP(Λ_2(0∨1,0∨2) | H_k)
...
(0∨2, 1)       Λ_1(0∨2, 1)      Λ_2(0∨2, 1)      dP(Λ_1(0∨2,1) | H_k) dP(Λ_2(0∨2,1) | H_k)
...

Once all the entries in Table II are entered, the evidence is combined by adding the probabilities from the fourth column together when the corresponding abscissae, i.e. the pairs $(\Lambda_1(d_1,d_2), \Lambda_2(d_1,d_2))$ in the second and third columns, are identical. Once the evidence from all sensors is combined using tables similar to Table II, decisions are associated with the combined evidence using rule (2.2) so that a desired performance criterion is optimized.

Thus, evidence combining at the fusion is done conditioned on each hypothesis separately. The evidence is then associated with the admissible decisions unconditionally using a LRT or a test that optimizes some performance measure. Notice that the set of decisions need not be the same as the set of hypotheses. Thus, evidence combining and decision making are understood as separate concepts in the framework of the Generalized Evidence Processing Theory.

The generalization of the Bayesian (and N-P) theory by the GEP theory is straightforward. An interpretation is probably required to establish the correspondence between GEP and D-S theories. If the probabilities $P(d_k = i \mid H_j)$, $i = 1, 2, 3$, are considered as (conditional) bpa's (basic probability assignments [Shaf '76]) in the D-S theory for the $k$-th sensor, $k = 1, 2, \ldots, N$, under hypothesis $H_j$, $j = 0, 1$, the evidence from the different sensors at the fusion is combined using the conditional distribution of the LR under the different hypotheses according to Table I or II. A new (conditional) belief function is generated using the decision thresholds at the fusion. The (hard) decisions at the sensors are used to simply produce a hard decision at the fusion, if needed, according to some optimality criteria. In that respect the GEP theory not only defines and processes the evidence according to an a-priori set of optimality criteria, but also provides, if needed, for optimized hard decisions both at the local (sensor) as well as the global (fusion) level, a capability which is not built into the D-S theory (see Section 3).

The decision boundaries in GEP theory determine how evidence is associated with propositions at the fusion and reflect the choice of the costs $w_{ij}$. To demonstrate the effect that the semantic content of the local decisions has on the global decision (fusion), several experiments were conducted in Gaussian and slow-fading Rayleigh channels. The following statistical models were assumed for the two channels.

Gaussian: Observation model at each sensor, $r \sim G(0, 1) : H_0$, and $r \sim G(s, 1) : H_1$, where $G(\alpha, \beta)$ designates a Gaussian distribution with mean $\alpha$ and variance $\beta$. If $P_f$ is the operating false alarm probability, the associated threshold is $t := Q^{-1}(P_f)$, where $Q(\cdot) = 1 - \Phi(\cdot)$, $\Phi(\cdot)$ is the cumulative distribution function (cdf) of the standard normal, and $Q^{-1}$ is the inverse of $Q$.

fhbe olonn profadbguy;  P^  <•  tll(l'»c)l  :  Dmactlan prahabtUty:  ^ 

where $\lambda$ is the threshold used, and $c$ the SNR at the sensor. In the single-level local logic Bayesian DDF with hard decisions at the sensors and fusion, the probabilities at the sensors were generated assuming fixed false alarm probabilities at the sensors equal to 0.05. For the multi-level local logic DDF, the ambiguous (soft or "fuzzy") decisions were generated by considering a ±20% uncertainty region about the thresholds that determine the decision boundaries in the Bayesian case. The numerical results that are presented refer to binary hypothesis testing, for which the set of "soft" decisions consists of $\{d_0 = H_0,\ d_1 = H_1,\ d_2 = H_0 \lor H_1\}$. Additional results for ternary hypothesis testing and arbitrary probability assignments can be found in [Galu '90].
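
For the Gaussian model, the threshold and the three soft-decision probabilities can be sketched as below. The reading of the "±20% uncertainty region" as the interval $[0.8t, 1.2t]$ around the hard threshold $t$ is an assumption made for illustration, as are the function names and the signal mean `s = 2.0`; the sketch also assumes $P_f < 0.5$ so that $t > 0$.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist(0.0, 1.0)


def q(x: float) -> float:
    """Q(x) = 1 - Phi(x): upper tail of the standard normal."""
    return 1.0 - STD_NORMAL.cdf(x)


def q_inv(p: float) -> float:
    """Inverse of Q, so that q(q_inv(p)) == p for 0 < p < 1."""
    return STD_NORMAL.inv_cdf(1.0 - p)


def soft_decision_probs(p_fa: float, s: float, width: float = 0.20):
    """Return the hard threshold t and P(d | H) for d0, d1, d2.

    r ~ G(0, 1) under H0 and r ~ G(s, 1) under H1.
    Assumed soft region: r in [(1 - width) t, (1 + width) t] -> d2.
    """
    t = q_inv(p_fa)
    lo, hi = (1.0 - width) * t, (1.0 + width) * t
    probs = {}
    for hyp, mean in (("H0", 0.0), ("H1", s)):
        p_d0 = 1.0 - q(lo - mean)   # r below the region: decide H0
        p_d1 = q(hi - mean)         # r above the region: decide H1
        probs[hyp] = {"d0": p_d0, "d1": p_d1, "d2": 1.0 - p_d0 - p_d1}
    return t, probs


t, probs = soft_decision_probs(p_fa=0.05, s=2.0)
print(round(t, 3))  # 1.645
print(probs)
```

The resulting conditional probabilities play the role of the masses $m_i^j(d)$ that enter Table I.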

In a set of experiments, the performance of Bayesian DDF (i.e. GEP) with soft decisions at the local level and hard decisions at the fusion was compared to Bayesian DDF with hard decisions both locally and at the fusion. Using the "±20% uncertainty region" described above to generate the soft decision "$H_0$ or $H_1$," the Level Of Confidence (LOC), which is equivalent to the (unconditional) probability of correct decision, was used for comparison. The LOC curves in Fig. 3 indicate that GEP outperforms Bayesian DDF with hard local decisions in all cases. The curves were obtained by assuming a fixed false alarm probability of 0.05 at the sensors and 0.005 at the fusion. GEP outperforms hard-decision Bayesian DDF in both binary and ternary hypothesis testing, in both Gaussian and slow-fading Rayleigh channels, and for any number of sensors. This does not come as a surprise if the decision set of GEP is thought of as the result of multi-level quantization of the data, and the quantization is done in a semantically intuitive fashion.

3. Distributed Decision Fusion using Dempster-Shafer's Theory

The difference between the Bayesian and D-S theory lies in the type of information that each sensor transmits to the fusion after processing the data locally. As will become clear in the sequel, if the propositions in the D-S theory are identified with decisions in the GEP (Generalized Bayesian) theory, then there are no semantic differences in the frame of discernment between the two theories. The difference lies in that the probability assignment in GEP still satisfies the Bayesian rule, whereas the D-S evidence assignment does not. Assuming that the number of hypotheses that are tested is fixed and the number of decisions (or frame of discernment in the D-S terminology) is fixed, the output of the local data processing is a set of probabilities regarding the likelihood that the data have been generated by one of the particular hypotheses or subset of hypotheses according to the frame of discernment. To that extent, the use of the term decisions in the D-S theory does not precisely reflect the output of the local processing. It is more appropriate to characterize the outcome of the local processing as evidence about a chosen set of propositions rather than a decision regarding a specific hypothesis or set of hypotheses. Thus, even if the frame of discernment is kept common between Bayesian and D-S approaches (by utilizing multi-level Bayesian logic), the mapping of the data to the output of the local processor is completely different: the Bayesian processor maps each datum to a particular, single decision (integer-valued scalar), whereas the D-S processor maps the same datum to a set of probabilities (multidimensional real-valued vector) associated with all decisions in the frame of discernment. Hence, the communication requirements between Bayesian and D-S processors and fusion are different. Assuming a frame of discernment consisting of $k$ propositions, the communication requirement for the Bayesian case is $\log_2 k$ bits (the bandwidth required to transmit one of $k$ integers), whereas for the D-S processor $k$ analog outputs must be transmitted to the fusion. Thus, unless the communication requirements for the two approaches are made common, no direct comparison in the performance of the two schemes is meaningful. Since such a performance comparison is beyond the objectives of this paper, we limit the discussion to the structure of the D-S DDF.

In D-S theory, a set of mutually exclusive and exhaustive propositions $\{u_i\}$ is assumed toward which evidence is being offered. To each proposition, their disjunctions, and negations, a nonnegative number between zero and one (or probability mass) is assigned. If $A$ is an atomic proposition, a disjunction of propositions, or a negation of a proposition, then a probability mass, $m(A)$, is assigned to $A$. The quantity $m(A)$ is a measure of the belief in proposition $A$ based on the evidence offered. If $U$ designates the frame of discernment, then

$$\sum_{A \subseteq U} m(A) \le 1 \qquad (3.1)$$

with the remaining $1 - \sum_{A \subseteq U} m(A)$ mass attributed to ignorance. Assuming that ignorance constitutes a separate proposition and extending the set $U$ to include this proposition, expression (3.1) holds as an equality. According to D-S theory, a support function is defined for single propositions as

$$\mathrm{spt}(u_i) = m(u_i) \qquad (3.2)$$

and for more complex propositions as

$$\mathrm{spt}(A) = \sum_{B \subseteq A} m(B) \qquad (3.3)$$

where "$\subseteq$" indicates subset. The plausibility function is defined as

$$\mathrm{pls}(u_i) = 1 - \mathrm{spt}(\bar{u}_i) \qquad (3.4)$$

where $\bar{u}_i$ indicates the negation of proposition $u_i$. Alternatively, the plausibility function for a proposition $u_i$ is obtained by summing the masses of all the disjunctions that contain $u_i$, including itself, i.e.

$$\mathrm{pls}(u_i) = \sum_{u_i \subseteq A} m(A) \qquad (3.5)$$

Hence, the support function is indicative of how much evidence is offered in support of a given proposition by all the propositions that relate to it. Furthermore, the plausibility function is indicative of how likely it is for a given proposition to have generated the data.
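
The support and plausibility definitions (3.3) to (3.5) translate directly into set operations. In the sketch below, propositions are represented as `frozenset`s of atomic hypotheses; this representation, the function names and the example masses are assumptions chosen for illustration.

```python
def support(masses, a):
    """spt(A): sum of m(B) over all propositions B contained in A (3.3)."""
    return sum(v for b, v in masses.items() if b <= a)


def plausibility(masses, a):
    """pls(A): sum of m(B) over all propositions B that intersect A (3.5)."""
    return sum(v for b, v in masses.items() if b & a)


H0 = frozenset({"H0"})
H1 = frozenset({"H1"})
EITHER = H0 | H1   # the compound proposition "H0 or H1"

# Hypothetical basic probability assignment that sums to one.
m = {H0: 0.5, H1: 0.3, EITHER: 0.2}

print(support(m, H0))        # 0.5
print(plausibility(m, H0))   # 0.7
```

For an atomic proposition the plausibility computed this way agrees with (3.4): here $\mathrm{pls}(H_0) = 1 - \mathrm{spt}(H_1) = 0.7$.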

Evidence from different, and independent, sources defined over the same frame of discernment is fused according to Dempster's combining rule [Demp '68]

$$m(u_k) = m_1 \oplus m_2 = \frac{\displaystyle\sum_{A_i \cap B_j = u_k} m_1(A_i)\, m_2(B_j)}{1 - \displaystyle\sum_{A_i \cap B_j = \emptyset} m_1(A_i)\, m_2(B_j)} \qquad (3.6)$$

where $m_1$ and $m_2$ designate the support (belief) functions from the two different sources of evidence defined over the same frame of discernment, $u_k$ is the proposition toward which evidence is sought, and "$\emptyset$" is the empty set [Shaf '76].

Renormalization of the combined evidence in rule (3.6) is required to reject evidence that corresponds to conflicting propositions. The D-S combining rule can be implemented in a tabular fashion that resembles that of GEP theory [Thom '89, '90]. To illustrate the mechanical similarities that exist between Dempster's combining rule and the GEP DDF, consider a simple binary hypothesis testing problem. If the frame of discernment is defined as $\{u_0 = H_0,\ u_1 = H_1,\ u_2 = H_0 \text{ or } H_1\}$, with $u_2$ indicating the inability to associate evidence from the data with a definite hypothesis, Dempster's combining rule for two sensors can be implemented using Table III. In Table III, $k$ designates evidence associated with conflicting propositions, which is used as the normalizing factor in (3.6). The combined evidence is calculated by summing all the product terms from Table III that result in the same intersection proposition, and normalizing the result. In multiple-source evidence combining, rule (3.6) is repeated sequentially until the evidence from all sources is exhausted.

Table III  Dempster's combining rule for two sensors

S1 \ S2      m_2(u_0)                    m_2(u_1)                    m_2(u_2)

m_1(u_0)     m(u_0) = m_1(u_0) m_2(u_0)  k = m_1(u_0) m_2(u_1)       m(u_0) = m_1(u_0) m_2(u_2)

m_1(u_1)     k = m_1(u_1) m_2(u_0)       m(u_1) = m_1(u_1) m_2(u_1)  m(u_1) = m_1(u_1) m_2(u_2)

m_1(u_2)     m(u_0) = m_1(u_2) m_2(u_0)  m(u_1) = m_1(u_2) m_2(u_1)  m(u_2) = m_1(u_2) m_2(u_2)
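
A minimal sketch of Dempster's combining rule (3.6) for two sources is given below, using the three-proposition frame of the binary example. Representing propositions as `frozenset`s, the function name `dempster_combine`, and the example masses are assumptions for illustration only.

```python
def dempster_combine(m1, m2):
    """Fuse two mass assignments with Dempster's rule (3.6).

    m1, m2: dict mapping frozenset propositions to masses.
    Returns (fused masses, conflict k), where k is the total mass of
    product terms whose intersection is empty.
    """
    fused = {}
    conflict = 0.0
    for a, mass_a in m1.items():
        for b, mass_b in m2.items():
            inter = a & b
            if inter:
                fused[inter] = fused.get(inter, 0.0) + mass_a * mass_b
            else:
                conflict += mass_a * mass_b   # entries marked k in the table
    if conflict >= 1.0:
        raise ValueError("sources are in total conflict; rule undefined")
    return {p: v / (1.0 - conflict) for p, v in fused.items()}, conflict


U0 = frozenset({"H0"})
U1 = frozenset({"H1"})
U2 = U0 | U1   # "H0 or H1"

# Hypothetical masses for two identical sensors.
sensor_masses = {U0: 0.6, U1: 0.3, U2: 0.1}
fused, k = dempster_combine(sensor_masses, sensor_masses)
print(round(k, 6))           # 0.36
print(round(fused[U0], 6))   # 0.75
```

For more than two sources, the same function can be applied repeatedly to the running result and the next source, which matches the sequential use of rule (3.6) described in the text.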

The difference between the D-S and Bayesian theory is that the probability assignments for the propositions in the frame of discernment of the D-S theory do not satisfy the fundamental axiom of (Bayesian) probability, namely

$$P(A \lor B) = P(A) + P(B) - P(AB) \qquad (3.7)$$

In the D-S context, the proposition $A \lor B$ is viewed as a separate entity in the frame of discernment and can be assigned an arbitrary probability mass. Still, all the probability assignments in the D-S theory must add up to one or to some positive quantity less than one, with the remaining probability mass needed to add up to one attributed to total ignorance [Shaf '76]. A correspondence between the propositions as defined in the D-S theory and the decisions as defined in the multi-level logic Bayesian theory can be established if the decisions of the multi-level logic Bayesian framework are identified with the propositions in the D-S frame of discernment. Once this correspondence is established, the fusion performance under the two approaches can be studied under common communication constraints. By disassociating decisions from the hypotheses under test, the Generalized Evidence Processing (GEP) provides a semantically common framework within which the N-P and D-S DDF approaches can be compared under common communication constraints.

Due to the difference in the way evidence is generated in Bayesian (N-P) and D-S theory, an unconditional performance comparison between the two theories is not, in general, feasible. Since in a lot of practical applications the performance of a decision making system is determined by fixing the false alarm probability and maximizing the detection probability at the fusion, it is meaningful to compare the Bayesian and D-S approaches based on an N-P criterion. In order to make the comparison possible, we assume that the basic probability assignment of the D-S DDF at the local level is determined using the likelihood function, i.e. we assume that

$$m(a \mid r) = P(a \mid r) \qquad (3.8)$$

where $a$ designates a proposition towards which evidence is provided, and $r$ the observations. Even when the bpa is resolved at the local level, the decision rule at the fusion after the local evidence is combined remains undetermined. In order to keep the decision rule in a D-S context while maintaining a basis for comparison with the Bayesian DDF, the decision rule that will be used for the D-S DDF will assign the data to the proposition that has the highest support among all propositions in the frame of discernment that correspond to definite hypotheses, i.e.

$$d(r) := d_i(r) : \max_i \, \mathrm{sup}(d_i) \ \text{and} \ d_i = H_i, \ \text{over all single hypothesis propositions} \qquad (3.9)$$

With the above assumptions, we prove the following theorem.

Theorem 1  Assume that the objective of the fusion is to maximize the detection probability after fusion for fixed false alarm probability. Let the observations of the local sensors be independent from each other conditioned on each hypothesis. Let the bpa for the D-S DDF be determined by the likelihood function (3.8) at the local level. If the fusion rule is the rule (3.9) above, then:

(a) If the local frame of discernment coincides with the hypotheses under test, i.e. no unions of hypotheses are used as basic propositions, the performance of the D-S DDF is the same as the centralized N-P (Bayesian) fusion.

(b) If compound-hypotheses propositions are allowed in the local bpa, then the performance of the D-S DDF is always inferior to the centralized N-P fusion and the distributed N-P fusion for the same communication overhead.

Proof  We prove the theorem for the case of two sensors and binary hypotheses testing. A generalization of the proof, although notationally involved, does not present any conceptual difficulties and as such is omitted.

Part (a)  According to the assumptions of the theorem, the bpa is

$$m(H_i) = \Pr(H_i \mid r) = \frac{p(r \mid H_i)\,\Pr(H_i)}{p(r)}, \quad i = 0, 1 \qquad (3.10)$$

and so the D-S requirement

$$m(H_0) + m(H_1) = 1 \qquad (3.11)$$

is satisfied. Using Dempster's combining rule (3.6) for two sensors, we obtain

$$\mathrm{sup}(H_0) = \frac{m^1(H_0)\, m^2(H_0)}{1 - m^1(H_0)\, m^2(H_1) - m^1(H_1)\, m^2(H_0)} \qquad (3.12)$$

where the division is the result of renormalization due to the removal of conflicting evidence after fusion, and the superscripts identify the sensors. A similar expression is obtained for the $H_1$ hypothesis if the indices in (3.12) are switched. The proposed decision rule (3.9) translates to

$$\frac{\mathrm{sup}(H_1)}{\mathrm{sup}(H_0)} \ \underset{H_0}{\overset{H_1}{\gtrless}} \ t \qquad (3.13)$$

where $t$ is some threshold to be determined. Taking into account that for this particular case the D-S rule yields

$$\mathrm{sup}(H_i) = m(H_i) \qquad (3.14)$$

and using expression (3.10), the D-S decision rule gives, after some elementary algebra and with the prior probabilities absorbed into the threshold $t$ from (3.15b) on,

$$\frac{p(r_1 \mid H_1)\,\Pr(H_1)\; p(r_2 \mid H_1)\,\Pr(H_1)}{p(r_1 \mid H_0)\,\Pr(H_0)\; p(r_2 \mid H_0)\,\Pr(H_0)} \ \underset{H_0}{\overset{H_1}{\gtrless}} \ t \qquad (3.15a)$$

$$\frac{p(r_1 \mid H_1)\, p(r_2 \mid H_1)}{p(r_1 \mid H_0)\, p(r_2 \mid H_0)} \ \underset{H_0}{\overset{H_1}{\gtrless}} \ t \qquad (3.15b)$$

$$\frac{p(r_1 \mid H_1)}{p(r_1 \mid H_0)} \cdot \frac{p(r_2 \mid H_1)}{p(r_2 \mid H_0)} \ \underset{H_0}{\overset{H_1}{\gtrless}} \ t \qquad (3.15c)$$

$$p(r_1 \mid H_1)\, p(r_2 \mid H_1) - t\, p(r_1 \mid H_0)\, p(r_2 \mid H_0) \ \underset{H_0}{\overset{H_1}{\gtrless}} \ 0 \qquad (3.15d)$$

which is precisely the centralized Bayesian N-P test. Thus, the performance of the D-S DDF in this case is identical to the optimal centralized Bayesian DDF for the same false alarm probability at the fusion.
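
Part (a) can be checked numerically with a short sketch. The Gaussian observation model $G(0,1)$ versus $G(s,1)$ is borrowed from Section 2; the function names, the prior `p1`, and the sample observations are assumptions made for illustration. With posterior bpa's, the ratio of fused supports should equal the centralized likelihood ratio times the squared prior ratio, a constant that can be absorbed into the threshold.

```python
from statistics import NormalDist


def posterior_bpa(r, s, p1=0.5):
    """bpa of (3.10): returns (m(H0), m(H1)) = posteriors given r."""
    like0 = NormalDist(0.0, 1.0).pdf(r) * (1.0 - p1)
    like1 = NormalDist(s, 1.0).pdf(r) * p1
    total = like0 + like1
    return like0 / total, like1 / total


def ds_support_ratio(r1, r2, s, p1=0.5):
    """sup(H1) / sup(H0) after Dempster fusion of two sensors, as in (3.12)."""
    a0, a1 = posterior_bpa(r1, s, p1)
    b0, b1 = posterior_bpa(r2, s, p1)
    conflict = a0 * b1 + a1 * b0
    sup_h0 = a0 * b0 / (1.0 - conflict)
    sup_h1 = a1 * b1 / (1.0 - conflict)
    return sup_h1 / sup_h0


def centralized_lr(r1, r2, s):
    """Centralized likelihood ratio, the left side of (3.15c)."""
    n0, n1 = NormalDist(0.0, 1.0), NormalDist(s, 1.0)
    return (n1.pdf(r1) / n0.pdf(r1)) * (n1.pdf(r2) / n0.pdf(r2))


r1, r2, s = 0.3, 1.2, 1.0
print(ds_support_ratio(r1, r2, s))   # equals the centralized LR for p1 = 0.5
print(centralized_lr(r1, r2, s))
```

Because the two statistics differ only by a data-independent factor, thresholding either one yields the same decision regions, which is the content of part (a).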

Part (b)  In the binary hypotheses testing case the only compound proposition in the frame of discernment is $\{H_0 \text{ or } H_1\}$. If we assume, without loss of generality, that the bpa for the three propositions is done by subtracting an equal amount of probability from the two propositions that correspond to the definite hypotheses and associating it with the compound proposition, the following bpa results

$$m^j(H_0) = \Pr(H_0 \mid r_j) - \epsilon(r_j)/2$$

$$m^j(H_1) = \Pr(H_1 \mid r_j) - \epsilon(r_j)/2 \qquad (3.16)$$

$$m^j(H_0 \text{ or } H_1) = \epsilon(r_j)$$

where the probability mass $\epsilon(r_j)$ can be data dependent. Using Dempster's combining rule to fuse the evidence, and suppressing the explicit dependence of $\epsilon_j$ on the data for notational simplicity, we obtain the following expressions for the support function regarding the two hypotheses:

$$\mathrm{sup}(H_0) = \frac{\Pr(H_0 \mid r_1)\Pr(H_0 \mid r_2) + \tfrac{1}{2}\left[\epsilon_1 \Pr(H_0 \mid r_2) + \epsilon_2 \Pr(H_0 \mid r_1)\right] - 3\epsilon_1\epsilon_2/4}{1 - \text{conflicting evidence}} \qquad (3.17a)$$

and

$$\mathrm{sup}(H_1) = \frac{\Pr(H_1 \mid r_1)\Pr(H_1 \mid r_2) + \tfrac{1}{2}\left[\epsilon_1 \Pr(H_1 \mid r_2) + \epsilon_2 \Pr(H_1 \mid r_1)\right] - 3\epsilon_1\epsilon_2/4}{1 - \text{conflicting evidence}} \qquad (3.17b)$$

from which the assumed decision rule

$$\frac{\mathrm{sup}(H_1)}{\mathrm{sup}(H_0)} \ \underset{H_0}{\overset{H_1}{\gtrless}} \ t \qquad (3.18)$$

yields

$$\left[p(r_1 \mid H_1)\, p(r_2 \mid H_1) - t\, p(r_1 \mid H_0)\, p(r_2 \mid H_0)\right] + \tfrac{1}{2}\left\{\left[\epsilon_1\, p(r_2 \mid H_1) + \epsilon_2\, p(r_1 \mid H_1)\right] - t\left[\epsilon_1\, p(r_2 \mid H_0) + \epsilon_2\, p(r_1 \mid H_0)\right]\right\} \ \underset{H_0}{\overset{H_1}{\gtrless}} \ \frac{3\epsilon_1\epsilon_2}{4}(1 - t) \qquad (3.19)$$

By comparing the decision rule (3.19) with the optimal N-P test rule (3.15d), it is seen that the first term in brackets on the left side of (3.19) is identical to the term on the left side of (3.15d). Since the decision rule (3.15d) is the optimal decision rule in the N-P sense, rule (3.19) would achieve optimal performance if and only if the rest of the terms in (3.19) could be made identically equal to zero for a fixed threshold $t$. However, even with data dependent bpa assignments $\epsilon_j(r_j)$, this is not possible in general (see footnote 2). Thus, the performance of the D-S DDF is inferior to the optimal centralized N-P fusion. Furthermore, since the performance of the distributed N-P decision fusion can be arbitrarily close to the optimal centralized one [TVB '87, Thom '90] by simply including some additional quality information bits along with the decisions or by increasing the number of quantization levels, the performance of the N-P DDF is always superior to the performance of the D-S DDF for a lesser amount of communication requirements. Notice that in the D-S DDF either the data itself has to be transmitted from the sensors to the fusion (which is the most efficient way), or the bpa's must be transmitted, thus making the communication requirements proportional to the number of propositions in the frame of discernment. (Clearly, a quantized version of the data or bpa's can be transmitted, resulting in a reduction of communication requirements and of performance as well.)

The above arguments extend easily to the multiple sensor case. The general multi-hypothesis case can be handled in a similar way as the two hypothesis case, only the expressions become more complicated. ∎

To compare the consistency of the GEP and D-S evidence combining rules, (2.2, 2.3) and (3.6) respectively, the following experiment was conducted. Numerical results have been obtained for binary and ternary hypothesis testing and for distribution based as well as arbitrary bpa's. However, due to limited space, only results from the binary hypothesis testing will be presented. For additional results, the reader is referred to [Galu '90 and Oa '90]. For GEP, conditional probabilities at the fusion center were obtained in the same manner as in the previously discussed simulations. The conditional probabilities at the sensor, from the GEP simulation, were used as the original probability assignments at the sensor for the D-S theory simulation. Conditional probability masses were obtained at the fusion using Dempster's combining rule. The conditional probabilities from GEP and the conditional probability masses from D-S theory were then used to calculate conditional plausibility according to (3.5). The results were obtained for a false alarm probability of .05 at the sensor and .005 at the fusion.

Figures 4 and 5 display results for Gaussian and Rayleigh distributed signals, respectively. Both graphs show
the plausibility conditioned on hypothesis H for five and ten sensors. To compare the two combining rules for
consistency, we define the crossover point as the SNR level above which the plausibility for the correct hypothesis
becomes greater than that for the incorrect hypothesis. Observe that for both the five and ten sensor cases the
crossover point occurs at a lower SNR for GEP than for D-S theory, so GEP works correctly over a wider range of SNR than
D-S theory does. Also notice the behavior as the number of sensors increases from five to ten. For GEP the crossover
point moves to lower SNR, while for D-S theory it does not move at all. This indicates that we can improve the performance
of GEP by increasing the number of sensors, which is a very desirable feature. The performance of D-S theory, on the
other hand, does not improve when the number of sensors increases.
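
The crossover point defined above can be located numerically from sampled plausibility curves. The following Python sketch is only an illustration: the function name `crossover_snr` and the sample values in the usage note are hypothetical and are not taken from the simulations reported here. It returns the first grid SNR from which the plausibility of the correct hypothesis stays above that of the incorrect one.

```python
import numpy as np

def crossover_snr(snr_db, pl_correct, pl_incorrect):
    """Smallest sampled SNR from which pl_correct stays above pl_incorrect.

    Returns None when the correct hypothesis does not dominate at the
    highest sampled SNR (no crossover inside the sampled range).
    """
    snr_db = np.asarray(snr_db, dtype=float)
    better = np.asarray(pl_correct) > np.asarray(pl_incorrect)
    if not better[-1]:
        return None
    # Indices where the correct hypothesis does NOT dominate.
    worse_idx = np.flatnonzero(~better)
    if worse_idx.size == 0:
        return snr_db[0]          # dominates over the whole range
    return snr_db[worse_idx[-1] + 1]
```

With made-up curves such as `crossover_snr([-10, -5, 0, 5, 10], [0.2, 0.4, 0.6, 0.8, 0.9], [0.5] * 5)`, the function picks the grid point at 0 dB. Comparing the value returned for two fusion rules on the same grid mirrors the comparison made in the text.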

Figures 6 and 7 show unconditional plausibility plots for the Gaussian and Rayleigh cases. More specifically, they
show the unconditional plausibility for the correct and incorrect hypotheses. Once again the results are shown for both
five and ten sensors. We see that for all cases the plausibility for the correct hypothesis is higher at lower SNR for
GEP than for D-S theory. The separation between plausibility for correct and incorrect hypotheses is much clearer
for GEP. In fact, at very low SNR D-S theory fails to separate the plausibility for the correct hypothesis from that of
the incorrect.

Conclusions

The two major evidence processing theories, namely Bayesian and Dempster-Shafer's, are presented as applied to
the problem of Distributed Decision or Evidence Fusion. Some of the fundamental results in Bayesian and Neyman-Pearson
DDF are presented. It is shown that a generalization of the Bayesian DDF using multi-level logic at the local processor
can provide a framework that allows comparison of the performance of the Bayesian and D-S DDFs under certain conditions.
To that extent, a theorem is developed that shows that if the objective is to maximize the detection probability at the
fusion for fixed false alarm probability, the Bayesian DDF outperforms the D-S DDF when multi-level logic is used locally,
i.e. at the sensors.

2. In the binary hypothesis testing, this is if the mass that is associated with the compound decision
{H0 or H1} is removed entirely from the probability mass of one and only one of the two other definite decisions.

Abstract - The problem of estimating the position of and tracking an object
undergoing 3-D translational and rotational motion using passive and active
sensors is considered. The passive sensor used in this study is a stereo
camera, whereas the active is a range radar. Three different estimation
approaches are considered. The first involves estimation of the object
position by direct registration of stereo images. In the second approach, the
Extended Kalman Filter is used for estimation with the stereo images as
measurements. In the third approach, an integral filter based on stereo images and
range radar measurements is used for tracking. The three different approaches
are compared via simulation in the tracking of an object undergoing a 3-D
motion with random translational and angular acceleration.

1. INTRODUCTION

Object positioning and tracking using data from passive sensors, such as
cameras, infrared (IR) sensors, etc., is a common problem in robotics,
automated manufacturing, space navigation, and surveillance. However, in
order to track an object undergoing 3-D motion using camera images,
one must recover depth, the dimension missing from a 2-D image. Hence, in order
to retrieve the position of an object in 3-D space, a means to recover
depth is necessary. In this study we assume that stereo vision [2] is used at
first to enable the recovery of the depth from a sequence of "stereo" images.
A problem associated with the use of stereo images is the matching of pixels
from right and left images with the correct points on the object. In order
to measure the depth of a point on a 3-D object, a point on the right image
must be matched with a point on the left image screen. A matching algorithm,
which is a modification of the algorithm introduced in [5], was used for
registration. Using the stereo camera images, the position and the velocity of
an object were estimated using two different methods: first, by direct
registration of the stereo images; and second, using an Extended Kalman
Filter. Earlier work on the use of the Kalman Filter for object tracking
includes that of [4]. However, in [4], a single camera was used to estimate the
position of an object undergoing pure translational motion with depth assumed
to be constant and known.

The noise associated with the observations on the image screens has to be
filtered out in order to achieve accurate estimates of the position and the
velocity of the object. The transformation equations from 3-D to 2-D
introduce nonlinearities in the observation model, and thus the Extended
Kalman Filter (EKF), which allows for nonlinearities in the estimation model,
must be used. In order to improve the accuracy of the position estimates,
the optic flow [3] was initially used along with the position on the
image screen as an additional measurement. The use of the optic flow, however,
did not seem to improve the performance of the estimation. Consequently, we
decided to omit the optic flow from our analysis. Instead, we decided to
use an additional active sensor to improve the accuracy of the tracker.

Thus,  a  range  radar  was  used  to  estimate  the  object  depth  separately.  The 
depth  estimate  was  combined  with  the  stereo  camera  images  using  an  EKF  to 
estimate  the  object  position  and  velocity  in  the  other  directions. 

2. ESTIMATION BASED ON DIRECT REGISTRATION OF STEREO IMAGES
2.1  The  Matching  Algorithm 

Given the stereo camera setup, Fig. 1.1, with 2d the distance between the
two cameras (assumed known), and f the cameras' focal length, the
transformation from a 3-D point with coordinates (x, y, z) to the left image
point (x', y') and the right image point (x'', y'') is given by [2]

$$x' = \frac{f(x-d)}{f-z}, \qquad x'' = \frac{f(x+d)}{f-z}, \qquad y' = y'' = \frac{fy}{f-z} \tag{2.1}$$

From the right and left images the depth z can be recovered using (2.2)

$$z = f - \frac{2df}{x'' - x'} \tag{2.2}$$
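
As a quick consistency check of the projection and depth-recovery relations, the following Python sketch projects a 3-D point onto the two image screens and then recovers its depth from the disparity. The function names and the numerical values are illustrative assumptions, not part of the original study.

```python
def project_stereo(x, y, z, f, d):
    """Project (x, y, z) to the left (x', y') and right (x'', y'') image points."""
    scale = f / (f - z)
    left = (scale * (x - d), scale * y)
    right = (scale * (x + d), scale * y)
    return left, right

def depth_from_disparity(x_left, x_right, f, d):
    """Recover the depth z from the matched horizontal image coordinates."""
    return f - 2.0 * d * f / (x_right - x_left)

# Illustrative values: focal length 0.5, half-baseline 0.1, depth -500.
left, right = project_stereo(10.0, 5.0, -500.0, f=0.5, d=0.1)
z_hat = depth_from_disparity(left[0], right[0], f=0.5, d=0.1)
```

Because the disparity x'' - x' appears in a denominator, any quantization of the image coordinates translates directly into depth error, which becomes severe when the disparity is only a few pixels.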

In order to recover the depth from (2.2), the pixels from the right and
left images, Fig. 2.1, must first be registered correctly. In order to
register the two stereo images, a point from the object must be matched with a
point on each one of the two images. A matching algorithm, similar to the one
introduced in [5], is used to find the most likely match between points on the
right and left images. The algorithm is based on two assumptions: 1) each
point in an image can only have one depth value; and 2) a point is very likely
to have a depth value near the values of its neighbors. The slightly modified
version of the algorithm [1] is given by

$$C_{n+1}(x,y,d) = \sum_{(x',y',d') \in S} C_n(x',y',d') - c_1 \sum_{(x',y',d') \in O} C_n(x',y',d') + c_2\, C_0(x,y,d) \tag{2.3}$$

where S corresponds to the excitatory region and O corresponds to the
inhibitory region. The constants $c_1$, $c_2$, and $\mu$ are arbitrary design parameters.
The function C is given a value of one if a specified threshold is exceeded
and a zero otherwise. The sigmoid

$$\mathrm{sigm}(x) = \frac{\exp(\mu x) - \exp(-\mu x)}{\exp(\mu x) + \exp(-\mu x)} \tag{2.4}$$

is used to smooth out the thresholded output. The excitatory and the
inhibitory regions are illustrated in Figure 2.2. The eight excitatory points
have the same depth as the point of interest. If some of the inhibitory
points are on, this will tend to keep the point of interest turned off, since
only one depth value can be assigned to a point. Another important assumption
in this matching algorithm is that both cameras are able to see the exact same
part of the object. This means that there are no points on the object that
are seen by only one of the two cameras.

2.2 Model of Translational Motion

In order to test the ability of the matching algorithm (2.3) to estimate
the position of an object undergoing 3-D translational motion, a sequence of
stereo images was generated using the model of a randomly accelerating object.
The continuous-time dynamics of the object with random acceleration are
described by the state equation

$$\dot{x}(t) = \begin{bmatrix} 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 0 \\ 0 & 0 & 0 & 0 & 0 & 1 \\ 0 & 0 & 0 & 0 & 0 & 0 \end{bmatrix} x(t) + \begin{bmatrix} 0 \\ 1 \\ 0 \\ 1 \\ 0 \\ 1 \end{bmatrix} w(t), \quad \text{where} \quad x = \begin{bmatrix} x \\ v_x \\ y \\ v_y \\ z \\ v_z \end{bmatrix} \tag{2.5}$$

is the state vector, and w(t) is uncorrelated, zero-mean, white Gaussian
noise with covariance $q(t)\delta(t-\tau)$, with $q(t) \ge 0$ for all t.

Notice that the dynamical model (2.5) is chosen to be unstable,
constituting a worst case testing paradigm. Using (2.5) and the 3-D to 2-D
projection equations (2.1), a sequence of images was generated, from which the
position of the object was estimated using the matching algorithm (2.3).

2.4  Simulation 


The model (2.5) was used to describe the 3-D motion of a flat thin
surface that was used as the object in the simulation. The transformation
equations (2.1), (2.2) were used to transform the position of the four corners
of the object into pixels on the two image screens. All pixels on the two
image screens located inside the four corner points were also turned on. The
resulting two image screens were then fed into the matching algorithm (2.3) in
order to match points on the two images. The matched pixels were then used to
get an estimate of the depth of the object using (2.2).

The distance between the two cameras was set to be 8 meters so that the
right and the left images were considerably different. The focal length, f,
was 0.5 meters. The two cameras were assumed to be moving in order to be able
to "see" the object at all times. The cameras move to the most recently
estimated (x, y) location of the object between two consecutive images. The
cameras are not moving in the z direction. Both images have a resolution of
16x16 pixels. The estimation errors in the x-direction are shown in Fig. 3.1.
The estimates in the z direction (not shown) were clearly the most inaccurate.
The main reason for the poor z estimate is the low resolution. The
denominator of the z expression is especially affected by the resolution,
since it depends on the difference between the two x estimates. Seeking
improved position estimates, the Extended Kalman Filter is considered next.

3. ESTIMATION BASED ON EXTENDED KALMAN FILTER AND STEREO CAMERA

The position and the velocity of the object are estimated given the
observations of the location of the object on the two image screens. The
observations are assumed to be noisy. The noise is introduced from
inaccurate readings of the image screen as well as from low image
resolution. The nonlinear transformation equations (2.1), (2.2) suggest the
use of the Extended Kalman Filter (EKF) [7]. The dynamical model and the
state vector are given by (2.9) and (2.10) respectively. The observation
model for the EKF was obtained from the transformation equations (2.1),
(2.2) by adding noise to account for the measurement noise at the camera and
errors in the registration of the images. The EKF measurement vector is

$$s(t) = \begin{bmatrix} f(x(t)+d)/(f-z(t)) \\ f\,y(t)/(f-z(t)) \\ f(x(t)-d)/(f-z(t)) \\ f\,y(t)/(f-z(t)) \end{bmatrix} + v(t) \tag{3.1}$$

where v(t) is uncorrelated, zero-mean, white Gaussian noise with covariance
$r(t)\delta(t-\tau)$, with $r(t) > 0$ for all t. The initial conditions for the state
vector are taken to be Gaussian with mean x(0) and positive definite
covariance matrix P(0). In (3.1), f is again the focal length and 2d the
separation between the two cameras. Assuming constant acceleration during the
sampling interval, the discrete time system is obtained from (2.5):

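Process Model and Observation Model

$$x_{k+1} = \begin{bmatrix} 1 & T & 0 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 & 0 & 0 \\ 0 & 0 & 1 & T & 0 & 0 \\ 0 & 0 & 0 & 1 & 0 & 0 \\ 0 & 0 & 0 & 0 & 1 & T \\ 0 & 0 & 0 & 0 & 0 & 1 \end{bmatrix} x_k + w_k, \qquad s_k = \begin{bmatrix} f(x_k+d)/(f-z_k) \\ f\,y_k/(f-z_k) \\ f(x_k-d)/(f-z_k) \\ f\,y_k/(f-z_k) \end{bmatrix} + v_k \tag{3.2}$$

where T corresponds to the sampling time. The noise covariance matrices for
$w_k$ and $v_k$ respectively are given by (3.3). For the EKF equations see [9].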


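$$R = \begin{bmatrix} 1/T & 0 & 0 & 0 \\ 0 & 1/T & 0 & 0 \\ 0 & 0 & 1/T & 0 \\ 0 & 0 & 0 & 1/T \end{bmatrix} \tag{3.3}$$

[The entries of the process noise covariance Q in (3.3) are illegible in the source.]

3.2 Simulation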


Using the discrete time equations (3.2) through (3.5), the object position
and velocity were estimated using the EKF [1], [7], [9]. In order to prevent
the object from moving out of the field of view of the stereo camera, the
camera was assumed to track the object using the estimated velocity in the x, y
directions. In the simulation, a focal length of 0.5 meter, a sampling time
of 1.0 second, and a spacing between the two cameras of 0.1 meter were used.
Assuming that the z-coordinate of the object was initially -500 meters, the
initial field of view is 100 m wide [9]. The fields of view of the two cameras
are fairly narrow due to the large focal lengths. The sampling time of 1.0
second implies that images from the two cameras are available every second. A
shorter sampling time will increase the performance of the filter, but since
the processing of the images takes considerable computation time, a trade-off
has to be made. The sampling time is therefore set to be 1.0 second.
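
To make the discrete model concrete, the following Python sketch builds the constant-velocity transition matrix for the state [x, v_x, y, v_y, z, v_z], the noise-free stereo observation in the order of (3.1), and its Jacobian with respect to the state, which is what an EKF update linearizes around the current estimate. The function names, the analytically derived Jacobian, and the sample state are assumptions of this sketch. The EKF recursion itself is not reproduced here.

```python
import numpy as np

def transition_matrix(T):
    """6x6 block-diagonal transition matrix for [x, vx, y, vy, z, vz]."""
    block = np.array([[1.0, T], [0.0, 1.0]])
    return np.kron(np.eye(3), block)

def measure(state, f, d):
    """Noise-free stereo observation h(x): [x_right, y_right, x_left, y_left]."""
    x, _, y, _, z, _ = state
    s = f / (f - z)
    return np.array([s * (x + d), s * y, s * (x - d), s * y])

def measurement_jacobian(state, f, d):
    """H = dh/dx evaluated at the given state."""
    x, _, y, _, z, _ = state
    g = f / (f - z)           # common scale factor f / (f - z)
    gz = f / (f - z) ** 2     # derivative of the scale factor with respect to z
    H = np.zeros((4, 6))
    H[0, 0] = g
    H[0, 4] = gz * (x + d)
    H[1, 2] = g
    H[1, 4] = gz * y
    H[2, 0] = g
    H[2, 4] = gz * (x - d)
    H[3, 2] = g
    H[3, 4] = gz * y
    return H

# One prediction step with T = 1.0 from a hypothetical initial state.
state = np.array([10.0, 1.0, 5.0, -1.0, -500.0, 2.0])
predicted = transition_matrix(1.0) @ state
```

Only the column of H associated with z carries the 1/(f - z)^2 factor, so the observations are far less sensitive to depth than to the lateral coordinates at long range.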

The observations are generated by the transformation equation from 3-D to
2-D using (3.1). It is assumed that the points from the right and the left
image have been matched previously. The filter is run for 300 iterations and
the state error along with the diagonal elements of the error covariance
matrix, indicated as "camera model," are plotted and shown in Figures
4.1-4.5. The parameters q and r are constants that multiply the covariance
matrices Q and R respectively. Note that the error in the velocity estimates
is very small while the position error grows occasionally before returning
to an acceptable range. The estimate in the z direction is the most
inaccurate. This is due to the nonlinear transformation equations. The
inaccuracy in z affects the other position components as well. The resulting
estimation errors are fairly large and biased.

Since the z term introduces large errors in the estimation, the filter was
run with fixed z and $v_z$ in order to observe the difference in the estimation
error. The resulting state errors and diagonal error covariance elements are
shown in Fig.s 4.6-4.7. Notice how all the error covariance elements settle to a
specific value. The state errors are considerably smaller in this case. In
addition, the state errors average out to zero.

The effect of the nonlinearities in the observation equation (3.2) can be
studied by considering the Taylor series expansion of the h vector in the
EKF given by

$$h(x) = h(\hat{x}) + H(\hat{x})(x - \hat{x}) + \text{H.O.T.} \tag{3.4}$$

where h and H have been defined previously and H.O.T. corresponds to higher
order terms. The higher order terms are neglected in the filter. The
approximation error that is made from neglecting the H.O.T. in (3.4) can
subsequently be estimated. The nonlinearity in the observation equations
(3.1) comes mainly from the z term in the denominator. Using (3.4), the
nonlinearity in the denominator of the observation equations can be
approximated by

$$\frac{1}{f - z} = \frac{1}{f - \hat{z}} + \frac{1}{(f - \hat{z})^2}(z - \hat{z}) + \text{error} \tag{3.5}$$

from which an approximate expression of the expected approximation error is

(3.6)


(3.6)  gives  an  expression  for  the  error  made  in  the  approximation  of  h(x)  by 
the  linear  terms  in  (3.4).  The  error  is  plotted  and  shown  in  Fig.  4.8.  The 
error  is  relatively  small  but  introduces  a  bias  on  the  state  estimates. 
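
The size and sign of the linearization error in (3.5) can be checked numerically. The Python sketch below compares 1/(f - z) with its first-order expansion about an estimate and with the leading neglected term, (z - z_hat)^2 / (f - z_hat)^3. That leading term is derived here from the Taylor series as an assumption of this sketch and is not claimed to be the expression the authors give in (3.6). The function names and numerical values are likewise illustrative.

```python
def linearization_error(z, z_hat, f):
    """Exact 1/(f - z) minus its first-order expansion about z_hat."""
    exact = 1.0 / (f - z)
    linear = 1.0 / (f - z_hat) + (z - z_hat) / (f - z_hat) ** 2
    return exact - linear

def leading_error_term(z, z_hat, f):
    """Second-order Taylor term of 1/(f - z) about z_hat."""
    return (z - z_hat) ** 2 / (f - z_hat) ** 3

# Illustrative values: f = 0.5, estimate at -500, true depth at -490.
err = linearization_error(-490.0, -500.0, 0.5)
lead = leading_error_term(-490.0, -500.0, 0.5)
```

Algebraically the error equals (z - z_hat)^2 / ((f - z_hat)^2 (f - z)), which is positive whenever f - z > 0 regardless of the sign of z - z_hat. This is consistent with the observation that the neglected terms are small but bias the estimates.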

4. DEPTH ESTIMATION THROUGH A RANGE RADAR

The model in section 3.1 produces inaccurate estimates of the object
position and velocity. The estimation error in the z direction is especially
large. It was seen in section 3.2 that the estimates can be greatly
improved if the depth z is known precisely. The estimate obtained from the
stereo camera could therefore improve if accurate estimates of the depth z were
available. A range radar is used to estimate the depth of the object
separately. Once the depth is estimated, the estimate is fed to the camera
filter to estimate the x, y components. The range radar is introduced in
section 4.1 and the integration of the range radar filter and the camera
filter is presented in section 4.2.

4.1 The Range Radar Filter

The range radar measures the distance (range) R to an object, along with
two associated angles, the azimuth $\eta$ and the elevation $\varepsilon$ [8], [9]. Using
polar coordinates allows us to perform tracking in the system from which the
measurements are obtained. The transformation between the polar coordinates
(R, $\eta$, $\varepsilon$) and the Cartesian coordinate system (x, y, z) used in the camera model
can be found in [8], [9]. The range radar filter is a coupled filter
containing a range part along with an angle part. The angle filter consists
of two individual filters for the two angles. The state vectors are given as
follows

(4.1)

The flow chart for the processing of this coupled filter is shown in Fig. 5.1.
The system models are given by

where the sampling index has been suppressed for simplicity. The
measurement models are of the following form

The transition matrices are defined as

[matrices illegible in the source] (4.4)

SPIE Vol. 1198 Sensor Fusion II: Human and Machine Strategies (1989) / 157

where [definitions illegible in the source; one surviving term has the form $R\cos\varepsilon$]

The observation matrices are constructed on the basis that all the
entries in the three state vectors are observable. The matrices are given by

The error covariance matrices for the model and the observation noise have the
following structure [8]:

[matrices illegible in the source]

The linear Kalman filter [7] is used to estimate R, $\eta$, and $\varepsilon$. The transition
matrices are updated at the beginning of each iteration. The estimates of R,
$\eta$, and $\varepsilon$ are used to generate the depth estimate according to $z = R\cos\eta\cos\varepsilon$.
4.2  The  Integrated  Filter 


The  estimate  of  the  depth  obtained  by  the  radar  filter  is  used  in  the 
camera  filter  to  help  estimate  the  x  and  y  coordinates.  The  integration  of 
the  two  filters  is  illustrated  in  Fig.  5.1.  It  is  assumed  that  the  target 
motion  can  be  accurately  modeled  as  the  motion  of  a  randomly  accelerating 
object.  The  actual  data  in  the  range  radar  filter  is  generated  from  the 
actual  model  through  the  transformation  equations  (4.3)  .  However,  in  the 
range  radar  filter,  it  is  assumed  that  the  data  is  generated  by  a  target 
undergoing  a  random  maneuver  during  the  interval  between  the  70th  and  the  90th 
time  step.  Thus,  an  intentional  mismatch  between  the  actual  model  and  the 
perceived range radar model is introduced to test the robustness of the range
radar filter and the integrated filter [9]. The estimate of the depth is used
in  the  transformation  equations  in  the  camera  filter  where  it  is  treated  as  a 
constant.  Thus,  the  resulting  Kalman  filter  is  linear.  The  cameras  are 
moving  as  described  in  section  2.4.  The  object  motion  is  strictly 
translational.  The  rotational  motion  is  covered  in  section  4.5. 

4.3  Simulation 

The integrated filter in the previous section was simulated with the
following parameters: T = 1.0, f = 0.5, d = 0.1, q = 0.01 (for camera filter), q = 0.1
(for range filter), r = 0.0068 (for range radar filter), r = 0.01 (for camera
filter and angle filters), a maneuver time constant of 100 in the range radar, and
$\sigma_m$ = 1.0 (maneuver standard deviation).

The choice of a lower r for the range radar is based on the assumption
that observations in this case are fairly accurate. The range radar filter
assumes that the object maneuvers in the interval between 70 and 90
iterations. The parameters that are associated with this maneuvering are given
above. The resulting estimation errors and the related error covariance
elements are shown in Fig.s i.l-I.S. Comparing these figures to the figures in
section 3, it is easily seen that the errors are reduced. The errors average
to zero as in the fixed z case in section 3. The elements of the error
covariance matrices behave better as well. The error covariance elements for
the range radar are reinitialized when the difference between an element in
two consecutive iterations is smaller than 0.001. Note how the errors
decrease every time a reinitialization occurs.

4.4  Estimation  Based  on  Mono  Camera 


Since the depth in the integral filter is estimated with measurements from
the range radar, the use of the stereo camera seems redundant. Comparison of
the x-direction estimates (and similarly those in the other directions) obtained with a
mono camera, Fig.s 5.3-5.4, with their stereo camera counterparts indicates
that the estimation errors and the error covariances are higher in the mono
camera case. The use of a stereo camera is therefore justified.

4.5  Rotational  Object  Motion 


The  previous  models  have  assumed  that  the  object  moves  with  only 
translational  motion.  Naturally  an  object  very  rarely  moves  with  zero 
rotational  velocity.  In  this  section  rotational  object  motion  is  introduced. 

Initially the rotational velocity is assumed to be known and constant.
The rotation is taken into account in a modified model of (2.5). The
observation equations remain the same. The z and zvel estimates are fed into
the camera filter from the range radar and are treated as inputs in the
camera model. The resulting discrete model is then given by

(4.10)


where $(\omega_x, \omega_y, \omega_z)$ are the known constant angular velocities. The covariance
matrix $Q_k$ of the noise $w_k$ is given by

[matrix illegible in the source] (4.11)

The state vector x is given by

(4.12)

The covariance Q that is used in the filter equations is given by

The range radar model is the same as before since it already incorporates
constant angular velocities. The above model was simulated with essentially
the same parameters as in the translational case. The sampling time was 1.0
second and q was set to 0.01. The angular velocities were all set to 0.011
rad/sec. The resulting estimation errors and the corresponding error
covariances are shown in Fig.s 5.5-5.6. The estimation errors in the position
are basically the same as they were for the purely translational case, whereas
the velocity estimates are worse.

Next we consider the case of random angular acceleration. The angular
velocities cannot be treated as constants in this case. Both the camera
filter and the range radar filter have to be modified. In order to avoid
additional nonlinearities in the camera filter, the angular velocities are
estimated in the range radar and fed into the camera filter just like the
estimates for z and zvel. The augmented state vectors in the range radar
are given by

where $(\omega_R, \omega_H, \omega_V)$ is the angular velocity. The system models are given by

The state transition matrices are defined by

where all the parameters have been defined previously. The error covariance
matrices for the model noise have the following structure

[matrices illegible in the source]

The camera model is modified in the following way

where $(\omega_x, \omega_y, \omega_z)$ and $(\dot{\omega}_x, \dot{\omega}_y, \dot{\omega}_z)$ are estimated in the range radar filter with
the use of the transformations in Appendix B in [1], [9].

The model described above was simulated. The parameters in (4.11) were
used. The initial values for the rotational state vector entries were
selected as follows:

$$(\omega_R, \omega_H, \omega_V) = (0.01, 0.01, 0.01), \qquad (\dot{\omega}_R, \dot{\omega}_H, \dot{\omega}_V) = (0.001, 0.001, 0.001) \tag{4.19}$$

The resulting estimation errors and the corresponding error covariances
for the object position in the x-direction are shown in Fig.s 5.7-5.8. The
errors are close to previous results. Figures of the estimates in the other
directions and in the associated velocities can be found in [1] and [9].
As expected, the overall performance of the filter degrades when the angular
velocity changes randomly.

CONCLUSION

Three different approaches for estimating the position of and tracking an
object undergoing 3-D translational and rotational motion were considered.
One approach involved a stereo camera and position estimation directly from
stereo image registration. The second approach involved a stereo camera and
use of an Extended Kalman Filter (EKF) for position and velocity estimation.
In the third approach, a range radar was used to estimate the depth from
separate measurements. The depth estimate was subsequently used in an EKF to
recover the object position and velocity (both translational and angular) from
a sequence of stereo images. Numerical comparison of the three approaches via
simulation indicates that the range radar - EKF integral filter is superior to
the other two approaches, Fig.s 4.1-4.5. Furthermore, the integral filter can
successfully track objects undergoing 3-D translational and rotational motion.
From the simulation results it is seen that the effects of random rotation are
more visible in the velocity estimates [1], [9]. The position estimates were
very close to those obtained in the purely translational motion case. The
performance is, therefore, not affected by the random angular acceleration,
except for the estimates of the compound velocities.
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Abstract 

DIGNET is a self-organizing artificial neural network (ANN) that exhibits deterministically reliable behavior
to noise interference, when the noise does not exceed a pre-specified level of tolerance. The complexity of the
proposed ANN, in terms of neuron requirements versus stored patterns, increases linearly with the number of
stored patterns and their dimensionality. The self-organization of the DIGNET is based on the idea of competitive
generation and elimination of attraction wells in the pattern space. DIGNET is used for Pattern Recognition and
Classification and for Signal Detection and Fusion. Analytical and numerical results are included.


1  Introduction 

Most artificial NN's (ANN's) that are used in the literature for pattern recognition and classification require that
the patterns that are stored and recognized be orthogonal to each other ([1], [2], [3], [4], [5], [6], [7]). Furthermore,
they are usually vulnerable to noise interference, in the sense that a usually small deviation from the orthogonality
assumption renders them unstable. For a viable neural-based solution to the recognition/classification problem in
the presence of noise, the artificial neural network must be designed so that it is, by design and not by accident,
robust to prespecified noise margins. DIGNET, the artificial neural network that we propose for automatic pattern
recognition and classification, signal detection and distributed data fusion, reflects this philosophy.

2  Proposed  Artificial  Neural  Network  Architecture 

Ideally, the input-output characteristic of an ANN that is used for pattern recognition and classification in
cluttered noise should resemble that of Fig. 1. In Fig. 1, the horizontal curves represent "attraction wells" around
the stored patterns. If the stored patterns are identified with equilibrium points of the ANN dynamics, then the
attraction wells of Fig. 1 represent attraction regions around these points in a multidimensional space. Thus, if the
noise is identified as a percentage disturbance of the stored patterns, the attraction wells represent hyperspheres of
predetermined radius around the patterns. So, if the ANN is presented with a distorted pattern that lies in one of
these attraction regions, correct recognition (and classification) will be guaranteed from the convergence of the ANN
to the correct equilibrium point. If, on the other hand, the ANN is initially presented with a pattern that lies outside
any of the attraction regions, a new attraction well will be created and the ANN will converge to the unknown pattern
as it should. Thus, an ANN with the characteristic of Fig. 1 exhibits learning capabilities, since new patterns can be
stored by extending the attraction points of the operating characteristic in Fig. 1. Furthermore, the noise tolerance
of the ANN can be changed by modifying the "width" of the attraction wells. Dignet dynamically realizes the ideal
characteristic of Fig. 1.


3  Directors  and  the  unity  hypersphere 

In linear system theory eigenvectors have meaning only as directions, their magnitude being undetermined. Any vector that lies in the direction of an eigenvector of the system is also an eigenvector, independent of its magnitude. On the other hand, a pattern is well defined irrespective of scaling or reversal. For instance, we can recognize a visual shape even under different light intensities (scaling), or even if we see the photographic negative (reversal). The above examples motivate the conceptualization of patterns as straight lines in the n-dimensional space. To further understand the operation of Dignet we introduce a mathematical entity that we call "director".

Figure 1: Ideal characteristic of ANN for Automatic Pattern Recognition and Classification

Definition 3.1 An n-dimensional director is the set of all vectors lying on the same straight line passing through the origin of an n-dimensional vector space. We use the notation a, b, c, d, ... to indicate directors.

We shall prove that the set of all n-dimensional directors (n-directors) is a metric space.

Definition 3.2 For two n-directors a, b we define as distance $\Theta(a, b)$ the absolute value of the acute angle between any two of their elements. In terms of the vector space it can be expressed as

$$\Theta(a, b) = \arccos\left(\frac{|\langle x, y\rangle|}{\|x\|\,\|y\|}\right) \qquad (1)$$

where x, y are vectors so that $x \in a$, $y \in b$.

It is easy to see that this distance fulfills all the properties of a metric:

1. Nonnegative, because $\arccos(x) \in [0, \pi/2]$ for $x \in [0, 1]$ (CBS inequality).

2. Symmetric, obviously, if we interchange a and b in the formula.

3. The triangle inequality clearly holds for the 3-dimensional space (with equality when all 3 directors lie on the same plane). However, any three non-collinear vectors (or straight lines) span a 3-dimensional subspace in the n-space that is homomorphic to the 3-D space. Therefore, the metric properties hold for the n-dimensional space, too.

Thus, the set of all n-directors is a metric space.

From Definition 3.1 it follows that a director, being a set, can be represented by one of its elements. A good choice is the unity vector that belongs to the particular director. This choice simplifies equation 1. If X and Y are unity vectors representing the directors a and b respectively, then

$$\Theta(X, Y) = \arccos(|\langle X, Y\rangle|) \qquad (2)$$

and the directors can be further represented as points on the surface of the unity hypersphere. In figure 2 we see the 3-dimensional case. This mapping of pattern vectors to unity vectors can be achieved by normalization and reduces an n-dimensional problem to an (n - 1)-dimensional problem.
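The director distance of equations 1 and 2 can be sketched in a few lines. This is an illustrative sketch only: the function names `director_distance` and `unit_representative` are hypothetical, and the clamping of the cosine is an added numerical safeguard that the text does not discuss.

```python
import math


def director_distance(x, y):
    """Angle in radians between the lines spanned by x and y (equation 1)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("the zero vector does not belong to any director")
    # Clamp to 1 to guard against rounding slightly above 1 (added safeguard).
    cosine = min(1.0, abs(dot) / (norm_x * norm_y))
    return math.acos(cosine)


def unit_representative(x):
    """Unity vector chosen to represent the director that contains x."""
    norm = math.sqrt(sum(a * a for a in x))
    return [a / norm for a in x]
```

Because of the absolute value, a vector and any scaled or sign-reversed copy of it are at distance zero, which is the scaling and reversal invariance described above.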
Figure 2: Normalized patterns and the unity sphere


The topological properties of this mapping are interesting; however, the algebraic properties are complicated. Therefore, to simplify things we assume that the angles are small, so that in the limit the surface of the sphere can be treated as a tangent plane. Then, if we consider a neighborhood around the pattern $P_i$, where $\Theta$ (theta) is the desired angular threshold for pattern matching, we say that a pattern is recognized by the exemplar $P_i$ if its projection on the surface of the sphere falls within the above neighborhood.

Since the vectors are already normalized, the angle corresponds to the inner product between vectors, and the comparison of a new pattern with a number of prestored exemplars can be achieved with a simple parallel vector-matrix multiplication and thresholding of the output, where the rows of the matrix correspond to the exemplars.


4  Description  of  Dignet

Dignet is a self-organizing neural network that can store and classify noisy inputs without supervised training. Its self-organization capability is based on the idea of competitive generation and elimination of attraction wells. The wells are generated around presented patterns, which are clustered according to their distance from the center of the wells. The center of a well moves dynamically towards the highest concentration of clustered points in the pattern constellation. The depth of a well indicates the strength of learning and is reflected in the inertia with which the center of the well moves when new data falls within its region of attraction.

A schematic diagram of Dignet is shown in Fig. 3. The pattern recognition and classification ability of Dignet is characterized by the competitive creation and elimination of attraction wells. Each well is characterized by its center, width (threshold), and depth. The similarity between patterns in Dignet is measured in terms of the angle that the patterns form among themselves. It is assumed that all patterns are normalized, so that the magnitude of a pattern does not affect the classification capability of the network. Assuming that a number of wells has already been created, the changes in the Dignet geography once a new pattern is presented are as follows.

Let $x_n$ represent the pattern that is presented to Dignet at the n-th time instant. If $c_{n-1}$ represents the center of an existing well in Dignet at the time the new pattern is presented, the center changes according to

$$c_n = \frac{d_{n-1}}{d_{n-1} + \epsilon_n}\,c_{n-1} + \frac{\epsilon_n}{d_{n-1} + \epsilon_n}\,x_n, \quad \text{with initial conditions } c_0 = 0, \qquad (3)$$

where $d_{n-1}$ is the depth of the well at the (n - 1)-st presentation, which is updated according to

$$d_n = d_{n-1} + \epsilon_n, \quad \text{with initial conditions } d_0 = 0, \qquad (4)$$

and $\epsilon_n$ is a variable that takes on the following values

$$\epsilon_n = \begin{cases} 1 & \text{if the pattern is won by the well (reinforcement)} \\ 0 & \text{if the pattern does not fall in the well (no interaction)} \\ -1 & \text{if the pattern falls in the well, but is not won by it} \end{cases} \qquad (5)$$
Figure 3: Schematic diagram of DIGNET

The width of a well (threshold) determines the region of attraction and is determined by the specified (desired) signal-to-noise ratio (SNR). The threshold is measured in degrees of angle from the center of the well. Given a SNR, the threshold (cosine) is determined by


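$$\text{threshold} = \frac{1}{\sqrt{1 + \sigma^2}}, \quad \sigma^2 \text{ being the noise variance fixed by the SNR}, \qquad (6)$$

and the well width (in degrees)

$$\Theta_0 = \arccos(\text{threshold}) \qquad (7)$$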

Equation 6 is obtained from figure 4. The noise component that contributes to the angular deviation from the center of the well $s$ is normal to the pattern. Therefore, for worst case analysis we can assume that the noise $n$ is normal to the pattern. If we cut the n-dimensional hypersphere by a 2-dimensional plane so that the vector $s$ lies on that plane as well as the center of the hypersphere, then we reduce the problem to an equivalent 2-dimensional problem. Then $\langle n, n\rangle + 1 = A^2 = \langle s + n, s + n\rangle$ by the Pythagorean theorem. Then

$$\cos(\Theta) = \frac{1}{A} = \frac{1}{\sqrt{1 + \langle n, n\rangle}} \qquad (8)$$

By using $\langle n, n\rangle = \sigma^2$ (in expected value sense), we obtain

$$\cos(\Theta) = \frac{1}{\sqrt{1 + \sigma^2}} \qquad (9)$$

from which, using the relation (10) between $\sigma^2$ and the SNR, (6) follows.
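As a numerical check of the geometry in equations 8 and 9, the sketch below maps a noise variance to the cosine threshold and to the corresponding angle. The function names are hypothetical, and the noise variance is taken directly as the input; how it is obtained from an SNR expressed in dB is left to the caller.

```python
import math


def cosine_threshold(noise_variance):
    """cos(theta) = 1 / sqrt(1 + sigma^2) for a unit-norm pattern (equation 9)."""
    if noise_variance < 0.0:
        raise ValueError("a variance cannot be negative")
    return 1.0 / math.sqrt(1.0 + noise_variance)


def threshold_angle_degrees(noise_variance):
    """Angle from the well center, in degrees, matching the cosine threshold."""
    return math.degrees(math.acos(cosine_threshold(noise_variance)))
```

For a noise variance of 1 the cosine threshold is 1/sqrt(2) and the angle is 45 degrees; as the variance goes to zero the angle shrinks to zero.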
Figure 4: 2-dimensional projection of pattern and normal noise

Once a pattern is presented to the network, its distance from the different wells is computed. If the minimum distance exceeds the well width, a new well is created; otherwise the pattern is assigned to the closest well, which is reinforced. Furthermore, if the pattern, in addition to falling in the region of attraction of its closest well, falls in the region of attraction of other wells as well, these wells are weakened: their center is pushed away and their depth decreases according to the above equations. To avoid excessive, spurious wells, a stage age (s.a.) is defined. The depth of each well is examined periodically, at the end of each s.a. If at the end of a s.a. the depth of a well does not exceed a certain threshold (age), the well is eliminated altogether; otherwise it survives this stage age.
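The presentation step described above can be sketched as follows. This is a minimal sketch under stated assumptions, not the authors' implementation: the class name `Dignet` and its parameters are hypothetical; the center update is assumed to be the depth-weighted recursion $c_n d_n = c_{n-1} d_{n-1} + \epsilon_n x_n$; the sign of the inner product is kept when measuring similarity; and a well whose depth reaches zero is dropped immediately, a choice the text does not specify.

```python
import math


def _unit(v):
    norm = math.sqrt(sum(a * a for a in v))
    return list(v) if norm == 0.0 else [a / norm for a in v]


def _cosine(u, v):
    return sum(a * b for a, b in zip(_unit(u), _unit(v)))


class Dignet:
    def __init__(self, cos_threshold, stage_age, min_depth):
        self.cos_threshold = cos_threshold  # cosine of the well width
        self.stage_age = stage_age          # presentations between pruning passes
        self.min_depth = min_depth          # depth a well must exceed to survive
        self.wells = []                     # each well is [center, depth]
        self.presented = 0

    def present(self, pattern):
        """Present one pattern; return the number of wells afterwards."""
        x = _unit(pattern)
        hits = [i for i, (c, _) in enumerate(self.wells)
                if _cosine(c, x) >= self.cos_threshold]
        if not hits:
            self.wells.append([x, 1])       # new well centered on the pattern
        else:
            winner = max(hits, key=lambda i: _cosine(self.wells[i][0], x))
            for i in hits:
                eps = 1 if i == winner else -1   # reinforce winner, weaken others
                center, depth = self.wells[i]
                new_depth = depth + eps
                if new_depth > 0:
                    center = [(depth * c + eps * xi) / new_depth
                              for c, xi in zip(center, x)]
                self.wells[i] = [center, new_depth]
            # Assumption: a well weakened down to zero depth is removed at once.
            self.wells = [w for w in self.wells if w[1] > 0]
        self.presented += 1
        if self.presented % self.stage_age == 0:  # end of a stage age
            self.wells = [w for w in self.wells if w[1] > self.min_depth]
        return len(self.wells)
```

With a threshold of `math.cos(math.radians(13.0))`, two noisy copies of one direction and two noisy copies of an orthogonal direction end up in two wells of depth 2 each.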


5  Stability  and  Convergence  Analysis  of  Dignet

For reasons of analytical compactness, we perform a stability and convergence analysis of Dignet by using the continuous time equivalent of the self-organizing algorithm (equations 3 through 7). Simple manipulation of the discrete-time algorithm yields the following continuous time algorithm:

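$$\frac{d}{dt}\bigl(c_i(t)\,d_i(t)\bigr) = \frac{d}{dt}\bigl(d_i(t)\bigr)\,x(t) \qquad (11)$$

$$\frac{d}{dt}\,d_i(t) = I\bigl[\Theta_0 - \Theta(c_i(t), x(t)) > 0\bigr]\,\Bigl(2\,I\bigl[\Theta(c_i(t), x(t)) = \min_j \Theta(c_j(t), x(t))\bigr] - 1\Bigr) \qquad (12)$$

where $c_i(t)$ designates the center of the i-th well in Dignet at time t, $d_i(t)$ the associated depth, and

$$\Theta(c_i, x(t)) = \arccos\left(\frac{\langle c_i(t), x(t)\rangle}{\|c_i(t)\|\,\|x(t)\|}\right) \qquad (13)$$

$I[Q]$ is the indicator function, defined to be one if the event Q is true, and zero otherwise. The minimum in 12 is understood over all existing wells in Dignet at time t.

Assuming zero initial conditions on d(0), i.e. d(0) = 0, the solution to the differential equation 11 is

$$c(t)\,d(t) = \int_0^t \dot{d}(\tau)\,x(\tau)\,d\tau \qquad (14)$$

where the notation "$\dot{d}(t)$" is used to indicate the time-derivative of d(t). Assuming $d(t) \neq 0$, and using the convention $0/0 = 0$, the solution 14 can be written as

$$c(t) = \frac{\int_0^t \dot{d}(\tau)\,x(\tau)\,d\tau}{d(t)} = \frac{\int_0^t \dot{d}(\tau)\,x(\tau)\,d\tau}{\int_0^t \dot{d}(\tau)\,d\tau} \qquad (15)$$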

For the i-th Dignet well with center $c_i(t)$, the integral in the denominator of (15) represents the average time that any input pattern x(t) fell into the region of attraction of well i and was won by this well (i.e., it was closer to the center $c_i(t)$ than to any other well center), minus the average time that any other pattern fell into the region of attraction of well i, but was lost to competition. (The convention $0/0 = 0$ is assumed in the analysis.) Thus, (15) produces wells whose centers are the time-averages of different noisy input patterns. Furthermore, it eliminates wells that are created from overlapping well boundaries. The solutions (15) are stable, assuming finite mean data, and converge either to a time average, if the pattern persists in the input data, or to zero, if the pattern is spurious. The stage-age parameter, s.a., that was introduced in the description of Dignet facilitates the elimination of unsustained and undesired spurious wells, in order to keep the storage capacity requirements of Dignet manageable. The algorithm (equations 11 through 13), or its equivalent discrete time version (equations 3 through 7), is thus capable of self-organization and can be used in a neural network for class-discrimination among different classes that are separable by hyperspheres. Classes of patterns which are separable by more complicated boundary shapes can be discriminated by Dignet through self-organization, if a different metric is used to determine the interaction among input patterns and well centers, other than the angle metric (1) used in the indicator function $I[\Theta_0 - \Theta(c_i(t), x(t)) > 0]$ in (12).
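The closed-form solution (15) says that a well center is the increment-weighted average of the patterns that interacted with the well. A short sketch can confirm that a discrete recursion of the form $c_n d_n = c_{n-1} d_{n-1} + \epsilon_n x_n$, with $d_n = d_{n-1} + \epsilon_n$, reproduces that average. The recursion is written here as the assumed discrete counterpart of equation 14, the function names are hypothetical, and the check only covers sequences in which the depth stays positive after the first win.

```python
def run_recursion(patterns, increments):
    """Step the assumed discrete update one presentation at a time."""
    center = [0.0] * len(patterns[0])
    depth = 0
    for x, eps in zip(patterns, increments):
        if eps == 0:
            continue                       # no interaction: nothing changes
        new_depth = depth + eps
        if new_depth <= 0:
            raise ValueError("sketch assumes the depth stays positive")
        center = [(depth * c + eps * xi) / new_depth
                  for c, xi in zip(center, x)]
        depth = new_depth
    return center, depth


def closed_form(patterns, increments):
    """Discrete analogue of (15): sum(eps * x) / sum(eps)."""
    depth = sum(increments)
    dims = range(len(patterns[0]))
    total = [sum(eps * x[k] for x, eps in zip(patterns, increments)) for k in dims]
    return [t / depth for t in total], depth
```

For the patterns (1,0), (0,1), (1,1), (1,0) with increments 1, 1, -1, 1, both routes give the center (0.5, 0) at depth 2.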

6  Comparison  with  other  self-organizing  networks 

Kohonen [9] has proposed a class of self-organizing feature maps that are based on the adaptation law

$$\frac{d\,m(t)}{dt} = \alpha(x, m, \eta)\,x(t) - \beta(x, m, \eta)\,m(t) \qquad (16)$$

$$\eta(t) = m^T(t)\,x(t) \qquad (17)$$

where $\eta(t)$ represents the neuron activation (or output for linear elements), $x(t)$ is the vector of the input excitations to the neuron, and $m(t)$ is the vector of the synaptic interconnections associated with the neuron and the input vector $x(t)$. $\alpha(\cdot)$ and $\beta(\cdot)$ are, in general, functions (possibly nonlinear) of the synaptic weights $m$, the input $x$, and the neuron output $\eta$. In Kohonen's self-organizing feature maps [9], the class of functions $\alpha(\cdot)$ and $\beta(\cdot)$ that he considers are memoryless functions.

To compare DIGNET with Kohonen's maps we rewrite equations (11) and (12), dropping the time-dependence for notational convenience, as follows:

$$\dot{c} = \frac{\dot{d}}{d}\,x - \frac{\dot{d}}{d}\,c \qquad (18)$$

$$\dot{d} = I\bigl[\Theta_0 - \Theta(c_i(t), x(t)) > 0\bigr]\,\Bigl(2\,I\bigl[\Theta(c_i(t), x(t)) = \min_j\{\Theta(c_j(t), x(t))\}\bigr] - 1\Bigr) \qquad (19)$$

with output equation

$$\eta = \max\{P x\}$$

where the maximum is taken over all center-patterns of created wells (clusters), and P is the matrix of the stored patterns (a matrix with the stored patterns as rows). By comparing equation (16) with equations (18) and (19), and identifying

$$\alpha(\cdot) = \beta(\cdot) = \frac{\dot{d}}{d},$$

the Dignet algorithm extends the class of Kohonen's feature maps by introducing memory in $\alpha(\cdot)$ and $\beta(\cdot)$. Another class of algorithms that can learn to discriminate among a number of different patterns (hypotheses) is based on the learning vector quantization (LVQ) algorithm and the creation of Voronoi vectors in the pattern space [10], [11]. However, the LVQ algorithm, and the algorithms derived from it, require that the number of unknown patterns (hypotheses) is precisely known a priori, much the same way Kohonen's self-organizing feature maps do. Furthermore, the number of Voronoi vectors must be close to the true number of different clusters in the pattern space. For convergence, the LVQ algorithm must be initialized with the proper number of Voronoi vectors and with initial conditions that are close to the stable equilibrium points. A modification of the LVQ algorithm that allows the adaptive update of the Voronoi vectors according to a majority decision rule was proposed in [11]. The modified LVQ algorithm avoids the instability of the original LVQ algorithm due to bad initial conditions, but it requires that the size of the Voronoi cells remains small, thus not really resolving the sensitivity problem of the algorithm.

If the initial choice of the Voronoi vectors in the LVQ algorithm is inadequate, there is no systematic approach to adaptively change their number as needed. Convergence of the LVQ algorithm depends on the proper choice of the Voronoi vectors and initialization of the algorithm close to the actual stable point. Convergence of the Kohonen feature maps depends on the choice of the $\alpha(\cdot)$ and $\beta(\cdot)$ functions, which are otherwise arbitrary. In that sense, neither the LVQ algorithm nor Kohonen's feature maps are truly self-organizing in the sense defined by Dignet, since the number of different patterns needs to be known a priori, and convergence is sensitive to the choice of initial conditions. In that respect, the guaranteed convergence of the Dignet algorithm to a number of stable classes, given noisy data from an unknown number of unknown patterns, represents the novelty of the algorithm that differentiates it from the LVQ algorithm and Kohonen's feature maps.

7  Capacity  of  Dignet 

Determination of the maximum capacity of Dignet to store patterns unambiguously depends on the metric that is used in the well formation, the dimensionality of the patterns, and their separation from each other in the presence of noise. The maximum capacity of Dignet to store input patterns unambiguously depends on the maximum amount of tolerable deformation, which depends on the prescribed SNR, and the initial separation of the patterns. The capacity of Dignet when the metric (1) is used in the well formation is discussed next.

For n-dimensional input patterns, assuming that the separation between patterns is equal to $\Theta_0 = \arccos(\text{thresh})$, where $\text{thresh} = (1 + \sigma^2)^{-1/2}$ with $\sigma^2$ the noise variance and $\Theta_0$ measured in radians, an approximation

of the maximum capacity of Dignet is given by


if a pattern and its negative are indistinguishable, and by

when  a  pattern  is  distinct  from  its  negative. 

The maximum number of unambiguous classes that Dignet can create increases with the dimensionality of stored patterns, since the number is proportional to the ratio of the surface of the hypersphere where the well centers are situated to the surface occupied by the width of a well. The estimates of the maximum capacity of Dignet are thus obtained by comparing the area of the surface of the n-dimensional sphere with the area of the hyperdome of solid angle $\Theta_0$. Notice that this capacity can be much higher than the capacity of conventional neural networks, and it is limited only by the minimum desired distance between exemplars that is dictated by the amount of noise that the network is required to be able to tolerate. The advantage of Dignet lies, thus, in its ability to create classes with prespecified noise tolerance. For example, for tolerance to SNR = 0 dB, $\sigma^2 = 1$, thresh $= 1/\sqrt{2}$, which corresponds to $\Theta_0 = \pi/4$, and thus $C_n = 2^{n-1}$ for indistinguishable negative from positive patterns, and $4^{n-1}$ for distinct positive from negative patterns. Hence, for 0 dB SNR, the well width should be set at 90°, which corresponds to a threshold of 45°. For tolerance to SNR = 24 dB, the well width drops to 26°, which corresponds to a threshold of only 13°, which yields a lower bound on the maximum capacity of Dignet equal to $6.67^{n-1}$ or $13.34^{n-1}$, depending on whether the Dignet is designed to be insensitive to orientation or not.
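The worked values quoted above can be cross-checked numerically. They are consistent with a capacity estimate whose base is $\pi/(2\Theta_0)$ when a pattern and its negative are indistinguishable, and $\pi/\Theta_0$ when they are distinct, both raised to the power n - 1. That form is an inference from the quoted numbers only; it is an assumption of this sketch, and the function names are hypothetical.

```python
import math


def capacity_bases(theta0):
    """Return (base if negatives merge, base if negatives are distinct).

    Assumed form, inferred from the quoted examples: pi/(2*theta0) and pi/theta0.
    """
    return math.pi / (2.0 * theta0), math.pi / theta0


def capacity_estimate(theta0, n, distinct_negatives=False):
    """Assumed estimate base**(n - 1) for n-dimensional patterns."""
    merged, distinct = capacity_bases(theta0)
    return (distinct if distinct_negatives else merged) ** (n - 1)
```

With $\Theta_0 = \pi/4$ the bases are 2 and 4, matching the 0 dB example. A base of 6.67 corresponds to a threshold angle of 13.5 degrees, which suggests that the quoted 13 degrees is a rounded figure.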

8  Implementation  of  Dignet 

An implementation of Dignet is shown schematically in Fig. 3. The different input patterns are represented by vectors that are stored directly as rows of the matrix P. The vectors are first normalized to render the recognition and classification abilities of the network insensitive to magnitude variations in input patterns. Since Dignet may be used for recognition and classification, the network must be independent of the relative level of intensity in the input patterns. Normalization of the input patterns creates equivalence classes between collinear patterns.

Once an input pattern is presented to Dignet, it is first sampled, and the sampled vector $x$ is normalized. The product $Px$ is formed and then passed through a vector threshold function $f_t(\cdot)$. The sample-and-hold operation prevents any input change during learning. Each element of the product $w = Px$ is equal to the inner product between $x$ and the stored exemplars (matrix rows) in the matrix P. Each element of the threshold vector function $f_t(\cdot)$ equals the maximum tolerable SNR between a pattern and the corrupting noise, expressed in radians between the stored patterns and the nominal pattern. The condition for passing the threshold is equivalent to the input being within an angle at most equal to arccos(thresh) from an exemplar. The i-th element of the threshold function is equal to:

$$f_t(w_i) = \begin{cases} 0 & \text{if } 0 < w_i < g_i \\ w_i & \text{if } g_i \le w_i \le 1 \end{cases}$$

where $g_i$ is the threshold for the i-th exemplar.

Hence, an input falls within a well with center some exemplar, if the threshold is exceeded for this exemplar. Notice that the above threshold function maintains the sign, so that two patterns with the same magnitude but opposite sign will be classified as different patterns, e.g. the network will preserve orientation by differentiating between black-and-white and white-and-black. If preservation of the sign is not important, $w$ can be replaced by $|w|$ in the inequalities in the thresholding operation. After thresholding, the output vector is fed into a maxnet [3] which selects the maximum thresholded output, i.e. the exemplar that is closest to the input pattern. Thus, recognition is achieved. Classification is achieved by forming the inner product between the output of maxnet and the row vector

$N := (1\ 2\ 3\ \ldots\ N)$. If a pattern is not recognized, the outputs of maxnet are all zero and the XOR gate

becomes high, thus enabling learning of a new pattern. During the learning of a new pattern, the "choose available slot" function selects the first unoccupied row of matrix P to store the new input pattern, thus creating a new well with center the new input pattern, depth $d_0$, and width equal to the threshold angle $\Theta_i$ ($\Theta_i = \arccos(g_i)$). If one of the outputs of the maxnet is high, this indicates that the input pattern has fallen in one, or more than one, of the attraction regions of the existing wells. In this case training of the matrix P takes place by updating the center and the depth of all the wells that have nonzero threshold output. Furthermore, the stage-age (s.a.) of all wells is examined, and wells that do not meet the stage-age requirement are eliminated, thus freeing the row (cell) they occupied in the storage matrix.
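The recognition path described in this section (product with the exemplar matrix, thresholding, maxnet, class index) can be sketched as below. This is an illustrative sketch, assuming unit-norm rows in `P`, a unit-norm input `x`, and strictly positive thresholds; the names `threshold` and `recognize` are hypothetical, and the maxnet is reduced to picking the largest thresholded output.

```python
def threshold(w, g):
    """i-th element: w_i if w_i >= g_i, otherwise 0 (the sign of w_i is kept)."""
    return [wi if wi >= gi else 0.0 for wi, gi in zip(w, g)]


def recognize(P, g, x):
    """Return the 1-based class of x, or 0 when no well claims the input."""
    if not P:
        return 0
    w = [sum(p * xi for p, xi in zip(row, x)) for row in P]   # w = P x
    t = threshold(w, g)
    best = max(range(len(t)), key=lambda i: t[i])
    # Maxnet output: a single 1 at the winning exemplar, or all zeros.
    maxnet = [1 if (i == best and t[i] > 0.0) else 0 for i in range(len(t))]
    labels = range(1, len(P) + 1)          # the row vector (1 2 3 ... N)
    return sum(m * k for m, k in zip(maxnet, labels))
```

A return value of 0 plays the role of the "not recognized" condition that enables learning of a new pattern. Replacing `wi` by `abs(wi)` in the comparison gives the sign-insensitive variant mentioned above.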


9  Character  recognition 

The ability of Dignet to self-organize into the correct number of classes, according to the number of different classes of patterns in the input, was tested using noisy letter characters and sinusoidal signals embedded in noise. Eight 64x64 pixel binary characters were chosen at random. Each character was reduced into a 4x4 character using a 16x16 template, averaging the pixel values over it, and then normalizing the resulting vector. Thus, each character was represented by a 1x16 normalized vector. Noise was added to each pixel of the 16x1 vector from a zero mean Gaussian distribution with variance determined by a prescribed SNR. The noise variance was $\sigma^2/n$, with n = 16 and $\sigma^2$ determined by the prescribed SNR, where the SNR is in dB. The stage age (s.a.) was taken to be three for these simulations.
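The preprocessing described above (16x16 block averaging of a 64x64 binary image, followed by normalization and additive Gaussian noise of variance $\sigma^2/n$ per element) can be sketched as follows. The function names are hypothetical, the image is assumed to be a list of rows of 0/1 values, and $\sigma^2$ is passed in directly rather than derived from an SNR.

```python
import math
import random


def reduce_character(image, block=16):
    """Average a square image over block x block tiles, then normalize."""
    tiles = len(image) // block
    features = []
    for r in range(tiles):
        for c in range(tiles):
            total = sum(image[r * block + i][c * block + j]
                        for i in range(block) for j in range(block))
            features.append(total / (block * block))
    norm = math.sqrt(sum(f * f for f in features))
    if norm == 0.0:
        raise ValueError("a blank image cannot be normalized")
    return [f / norm for f in features]


def add_noise(vector, sigma2, rng):
    """Add zero-mean Gaussian noise of variance sigma2 / n to every element."""
    std = math.sqrt(sigma2 / len(vector))
    return [v + rng.gauss(0.0, std) for v in vector]
```

A 64x64 image reduces to a 16-element unit vector; calling `add_noise(vector, sigma2, random.Random(0))` gives a reproducible noisy copy.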

Simulation results with two different SNRs, 50 dB and 24 dB, are shown in Figures 5 and 6.

In both cases, Dignet was able to self-organize into the correct number of patterns, eight in this case. The 3-D plots in Figures 5 and 6 demonstrate the creation of wells (classes) during the self-organization of Dignet and are recorded according to the well depth. For 50 dB, very few spurious wells were generated and survived the stage age. However, the number of spurious wells increased as the SNR decreased, along with their average lifetime. For both cases, Dignet was able to classify the eight different input patterns into eight different classes (wells).

In Fig. 7 the history of the center of a well is recorded as a function of the deviation of the center of the well from the pattern that it represents. The crosses represent the distance of (angle between) the well center associated with each input pattern from the nominal pattern, measured in degrees. The squares are the data points and represent the distance of an input pattern from the nominal pattern. The well width (threshold) for this particular case is set at 13°, commensurate with the 24 dB SNR. Various spurious wells are created during the self-organization. However, only the center of one well gets reinforced and converges to the true pattern, its center distance from the nominal pattern approaching zero, whereas all other spurious wells get eliminated. A similar picture is obtained when different characters are presented alternately.

10  Detection  of  unknown  number  of  unknown  signals 

An experiment was conducted using eight cosines with integer frequencies one through eight. Each cosine was sampled at the Nyquist rate of the highest frequency. Thus, sample vectors of size 1x16 were generated. At each element of the vectors noise was added from a zero mean Gaussian distribution with variance $\sigma^2/n$, with n = 16 and $\sigma^2$ determined by the specified SNR. From figure 8 it is seen that Dignet is capable of self-organization into the correct number of signals for SNR 5 and 0 dB with a limited number of spurious classes. However, for -5 dB, the number of spurious classes increases, their life expectancy increases, and the resolution of the correct classes decreases (Fig. 9). A similar picture is obtained by tracking the center of the wells for 10 dB in figure 10. Obviously, only the center of one well converges to the true input cosine. A similar picture is obtained when cosines of different frequency are presented.

Figure 5: Space-time history of well-creation for eight different characters at 50 dB SNR

Figure 6: Space-time history of well-creation for eight different characters at 24 dB SNR

Figure 7: Convergence of a well center in Dignet for one character at 24 dB SNR
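The test signals of this experiment can be generated with a short sketch. It assumes a unit-length observation window, so that sampling at twice the highest frequency (8) gives 16 samples per cosine; the function name `cosine_patterns` and the normalization step are illustrative choices.

```python
import math


def cosine_patterns(num_signals=8, samples=16):
    """Unit-norm sample vectors of cos(2*pi*k*t), k = 1..num_signals."""
    patterns = []
    for k in range(1, num_signals + 1):
        v = [math.cos(2.0 * math.pi * k * m / samples) for m in range(samples)]
        norm = math.sqrt(sum(a * a for a in v))
        patterns.append([a / norm for a in v])
    return patterns
```

Under these assumptions the eight sample vectors are mutually orthogonal, so the noise-free patterns are 90 degrees apart in the director metric.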

11  Topology  of  Multisensor  Fusion  Using  DIGNET

In [12], [13], [14], [15], [16] and [17] a Neyman-Pearson (N-P) theory for the Distributed Decision Making (DDM) problem was developed. In [18] it was shown that there exists a one-to-one topological correspondence between the Bayesian and N-P solution of the DDM problem and neural networks. Furthermore, it was shown that neural networks exhibit Receiver Operating Characteristics (ROC) that are close to the optimal Likelihood Ratio Test (LRT) ROC, when trained with the proper training rule [19].

It has been shown that DIGNET can be successfully used for detection of an unknown number of unknown patterns. In this section a topology is proposed for using Dignet in Multisensor Detection.

Figure 11 shows an implementation of the parallel fusion scheme of [20], using Dignets. The signal received by each sensor is fed to a Dignet (possibly after some preprocessing), where the closest stored signal (pattern) is recognized and appears at the output of the i-th sensor Dignet (fig. 3) as a vector $P_i$. A weighted average of the outputs (see below) is then fed to the Dignet of the fusion center, which is used as a classifier (only output "c" in fig. 3 is used).

Along with the vector outputs of the sensor Dignets, the well depths of the recognized patterns are fed to the weighted average stage, where they are used as the weighting factors:

$$F_{in} = \sum_{i=1}^{m} d_i\,P_i \qquad (25)$$

where $F_{in}$ is the input vector of the fusion center, m is the number of sensors, and $P_i$ and $d_i$ are the output vectors and depths of the sensor Dignets.
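The depth-weighted combination fed to the fusion center can be sketched as below. This assumes the weighted sum form $F_{in} = \sum_i d_i P_i$; the final normalization is an added step that mirrors the normalization Dignet applies to every input, and the function name `fuse` is hypothetical.

```python
import math


def fuse(outputs, depths):
    """Depth-weighted sum of the sensor output vectors, then normalized."""
    n = len(outputs[0])
    fused = [sum(d * p[k] for p, d in zip(outputs, depths)) for k in range(n)]
    norm = math.sqrt(sum(f * f for f in fused))
    # An all-zero result (no sensor has any depth) is returned unchanged.
    return fused if norm == 0.0 else [f / norm for f in fused]
```

With two sensors reporting orthogonal patterns at depths 9 and 1, the fused vector lies within about 6.4 degrees of the deeper well's pattern, which illustrates how a shallow, probably spurious, well contributes little to the fused input.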

A deep well is a well that has "recognized" many patterns, and a shallow well is one that most probably is spurious (created by some earlier signal or by pure noise). This motivates the above topology, where a recognized pattern with a deeper well is taken into consideration more than another of less depth. In practice, this means that since a sensor with higher SNR produces deeper wells than another with lower SNR, its output is taken into higher consideration

by the fusion Dignet; thus, the Dignet fusion topology has built-in fault tolerance.

The binary case (two signals) is a special case of the case of an unknown number of unknown signals. Furthermore, in the radar detection problem there is only one signal. Hypothesis $H_1$ corresponds to the presence of signal plus noise and hypothesis $H_0$ corresponds to the presence of only noise. Without noise, the absence of signal would result in a zero pattern vector, which is a singularity in the director space, since it cannot be mapped on the surface of the unity sphere (it cannot be normalized). In order for the Dignet to function in that case, the zero vectors must be ignored (neither recognition nor training is performed).

An alternative approach is to map the zero vector by convention to some other vector. This is valid only if the signal is known, so that the choice of a different director is possible for the mapping of the zero vector.

In the application of Dignet on the multisensor radar detection problem the first approach was used, i.e. the zero
vector was ignored. It is thus expected that the signal will create a single "deep" well corresponding to H1. In
the absence of signal, the noise, having random direction, is mapped on the surface of the unity sphere in such a way
that no matter what the noise distribution is, the distribution on the surface is uniform, at least in the Gaussian
noise case. This causes shallow wells to be created (uniformly) on the surface and to disappear after a very short time.
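The first approach (ignoring zero vectors) can be sketched as follows. This is only an illustration of the normalization step: the function names and the tolerance `eps` are assumptions made here and are not part of Dignet as described in the text.

```python
import math

def to_director(pattern, eps=1e-12):
    """Map a pattern vector onto the unit sphere.

    Returns None for a (numerically) zero vector, which cannot be normalized.
    The tolerance eps is a hypothetical choice for this sketch.
    """
    norm = math.sqrt(sum(x * x for x in pattern))
    if norm <= eps:
        return None  # zero vector: neither recognition nor training is performed
    return [x / norm for x in pattern]

def directors(patterns):
    """Normalize a batch of patterns, silently dropping the zero vectors."""
    result = []
    for pattern in patterns:
        director = to_director(pattern)
        if director is not None:
            result.append(director)
    return result
```

Returning `None` rather than some default direction keeps the zero vector out of both the recognition and the training path, which is the behaviour the first approach calls for.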

For the following experiments a cosine was sampled at the Nyquist sample rate and Gaussian noise was added to
the sampled vector element by element, as in section 10.
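A minimal sketch of how such a test vector could be generated is given below. It assumes that SNR means the ratio of average signal power to noise variance expressed in dB, and that "Nyquist rate" means two samples per period; both are assumptions of this sketch, and the function name and parameters are hypothetical.

```python
import math
import random

def noisy_cosine(n_samples, snr_db, samples_per_period=2, rng=None):
    """Return (clean, noisy) sampled cosine vectors.

    Assumed SNR definition: average signal power / noise variance, in dB.
    samples_per_period=2 corresponds to sampling at the Nyquist rate.
    """
    rng = rng or random.Random(0)
    clean = [math.cos(2.0 * math.pi * k / samples_per_period)
             for k in range(n_samples)]
    signal_power = sum(x * x for x in clean) / n_samples
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    # Independent Gaussian noise is added to every element of the vector.
    noisy = [x + rng.gauss(0.0, sigma) for x in clean]
    return clean, noisy
```

For example, `noisy_cosine(16, -30.0)` would give a vector in which the noise power is 1000 times the signal power, the regime used in most of the experiments reported here.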

In figures 12 and 13 the threshold is .70 and .85 respectively and P_F and P_D are plotted vs SNR. We notice that
P_F assumes a minimum value and it cannot decrease further no matter how high the SNR is.

In figure 14 the SNR stays constant at -30 dB and P_F and P_D are plotted w.r.t. threshold. As expected they
both decrease as the threshold increases and the well becomes smaller.

The case of unequal SNRs is tested in figure 15. Initially the SNR is 0 dB (equal for both sensors). P_F and P_D
are plotted w.r.t. time for 10* time slots. At time t = .5 x 10* the first sensor breaks down and its SNR becomes
-60 dB.

There is no noticeable effect of the sensor malfunction in the graph. The very high noise of the broken sensor
causes misclassification, but the weighting stage (figure 3) causes the fusion to ignore the sensor's output. The ripple
in the P_F curve is due to the small number of time samples.

In figure 16 the Receiver Operating Characteristic is given for SNR = -30 dB.

In figure 17 the case for SNR = -30 dB is shown for a 3 sensor fusion. The corresponding R.O.C. is shown in
figure 18.

In figure 19 the case for SNR = -30 dB is shown for a 4 sensor fusion. The corresponding R.O.C. is shown in
figure 20.

We notice that increasing the number of sensors increases the P_D for the same P_F. For example, for P_F = 003,
with 2 sensors P_D ≈ 0.875, with 3 sensors P_D ≈ 0.95 and with 4 sensors P_D ≈ 0.98.

Figure 14: Probabilities of detection and false alarm as functions of threshold (2 sensors) for SNR = -30 dB.

Figure 15: Probabilities of detection and false alarm as functions of time (2 sensors) for SNR = 0 dB. At time t
the first sensor breaks down and its SNR becomes -60 dB.

Figure 16: Receiver Operating Characteristic for Gaussian noise and SNR = -30 dB.

Figure 20: Receiver Operating Characteristic. Gaussian noise, SNR = -30 dB, 4 sensors.

12  Conclusions 

A new artificial neural network, DIGNET, was introduced for automatic pattern recognition and classification.
The proposed ANN exhibits self-organization capabilities according to a prescribed tolerance to noise interference, and
neuron requirements that grow linearly with the size and the number of patterns that need to be stored. It is
shown that the self-organization algorithm of Dignet leads to stable classes that are created around patterns that are
sustained in the input data over time. Dignet was tested successfully with pattern classification and signal detection
paradigms.

A sensor fusion topology using DIGNETs was introduced, and numerical results for Gaussian additive noise
showed that Dignet performs well under unknown statistical environments.

References 

[1] John J. Hopfield. Neural networks and physical systems with emergent collective computational abilities.
Proceedings of the National Academy of Sciences, 79:2554-2558, April 1982.

[2] J. L. McClelland, D. E. Rumelhart, and the PDP Group. Parallel Distributed Processing, Vol. 1 & 2. The MIT
Press, Cambridge, MA, 1987.

[3] Richard P. Lippmann. An introduction to computing with neural nets. IEEE ASSP Magazine, pages 4-22,
April 1987.

[4] DARPA neural network study, Oct. 1987 - Feb. 1988. Executive Summary, Lincoln Lab., MIT, Lexington, MA,
1987.

[5] Stephen Grossberg. Nonlinear neural networks: Principles, mechanisms and architectures. In Neural Networks,
Vol. 1, pages 17-61. Pergamon Press, 1988.

[6] Lei Zhang and S. C. A. Thomopoulos. Neural network implementation of the shortest path algorithm for traffic
routing in communication networks. International Conference on Artificial Neural Networks, poster paper, June
1989.

[7] S. C. A. Thomopoulos, L. Zhang, and C. D. Wann. Neural network implementation of the shortest path
algorithm for traffic routing in communication networks. In Allerton Conference, October 1990.

[8] S. C. A. Thomopoulos and Dimitrios K. Bougoulias. DIGNET: A self-organizing neural network for automatic
pattern recognition and classification. SPIE conference on sensor fusion, Boston, MA, to appear, November
1991.


494 / SPIE Vol. 1611 Sensor Fusion IV (1991)


[9] Teuvo Kohonen. State of the art in neural computing. In IEEE First International Conference on Neural
Networks, pages I-79-I-90, 1987.

[10] R. O. Duda and P. E. Hart. Pattern Classification and Scene Analysis. J. Wiley & Sons, New York, 1973.

[11] J. S. Baras and A. LaVigna. Convergence of a neural network classifier. In Proceedings of 29th CDC, Honolulu,
Hawaii, pages 1735-1740, December 1990.

[12] S. C. A. Thomopoulos, R. Viswanathan, and Dimitrios K. Bougoulias. Optimal decision fusion in multiple
sensor systems. In Proceedings of the 24th Allerton Conference, Monticello, IL, Oct. 1-3, pages 954-993, 1986.

[13] S. C. A. Thomopoulos, R. Viswanathan, and Dimitrios K. Bougoulias. Optimal decision fusion in multiple
sensor systems. IEEE Transactions on Aerospace and Electronic Systems, 23:644-653, September 1987.

[14] S. C. A. Thomopoulos, R. Viswanathan, and D. K. Bougoulias. Optimal and suboptimal distributed decision
fusion. Technical Report TR-SIU-DEE-87-5, Department of Electrical Engineering, Southern Illinois University,
Carbondale, IL, August 1987.

[15] S. C. A. Thomopoulos, Dimitrios K. Bougoulias, and Lei Zhang. Optimal and suboptimal distributed decision
fusion. In SPIE Technical Symposium on Optics, Electro-Optics, and Sensors, Apr. 4-8, Orlando, FL, 1988.

[16] S. C. A. Thomopoulos, R. Viswanathan, and Dimitrios K. Bougoulias. Optimal and suboptimal distributed
decision fusion. In 22nd Annual Conference on Information Sciences and Systems, Princeton University, NJ,
March 16-18, pages 885-890, 1988.

[17] S. C. A. Thomopoulos, R. Viswanathan, and Dimitrios K. Bougoulias. Optimal distributed decision fusion.
IEEE Transactions on Aerospace and Electronic Systems, 25:761-765, September 1989.

[18] S. C. A. Thomopoulos. Decision and evidence fusion in sensor integration. In Advances in Control and Dynamic
Systems. Academic Press, November 1991. Volumes 45, 46, 47 and 48, to appear.

[19] S. C. A. Thomopoulos, I. Pappadakis, H. Sahinoglou, and D. Bougoulias. Centralized and distributed decision
making with structured adaptive networks, perceptron-like and self-organizing neural networks. In Data Fusion
in Robotics and Machine Intelligence. Academic Press, 1992. To appear.

[20] D. K. Bougoulias. Distributed Decision Making with Bayesian and Neural Network Approaches. PhD thesis,
University of Southern Illinois, Carbondale, IL, June 1991.




IJCNN '91 SINGAPORE
INTERNATIONAL JOINT CONFERENCE ON NEURAL NETWORKS




NEURAL NETWORK IMPLEMENTATION OF THE SHORTEST PATH ALGORITHM
FOR TRAFFIC ROUTING IN COMMUNICATION NETWORKS


Stelios C. A. Thomopoulos, Lei Zhang, Chin Der Wann

Decision and Control Systems Laboratory
Department of Electrical & Computer Engineering
The Pennsylvania State University, University Park, PA 16802

Department of Electrical Engineering
The University of Maryland, College Park, MD 20740

Abstract

A neural network computation algorithm is introduced to solve for the optimal traffic routing in a general N-node
communication network. The algorithm chooses multilink paths for node-to-node traffic which minimize a certain cost function
(e.g. expected delay). Unlike the algorithm introduced earlier in this area, knowledge of the number of links (hops) between
each origin-destination pair is not required by the algorithm, therefore it can be applied to a more general variable length path
routing problem. The neural network structure for implementing the algorithm is a modification of the one used by the
Traveling Salesman algorithm. Computer simulations in a nine-node and a sixteen-node grid network show that the algorithm performs
extremely well in single and multiple paths.

I. Introduction

The computational power and the speed of collective analog networks of neurons in solving optimization problems have
been demonstrated by Hopfield and Tank [1]-[3] through the famous 'Traveling Salesman Problem'. A similar procedure can be
applied to solve a number of optimization problems [6]. In order to solve a practical optimization problem using a neural
network structure, it is necessary to find algorithms for determining the connections and weights of the neural network so
that it converges to the appropriate answer. In this paper, we suggest a neural network structure that can determine the
optimal route for node-to-node traffic in an N-node communication network. The structure is an implementation of the so
called 'Shortest Path Routing Algorithm', in which a route is selected for every origin-destination (OD) pair such that the
transmission cost is minimized if data is transmitted along this route.

The main function performed by a routing algorithm is the selection of routes for various origin-destination pairs.
There are two main performance measures that are substantially affected by the routing algorithm: the throughput (quantity of
service) and the average delay (quality of service). A good routing algorithm should select the routes which have minimum
average delay (thus allowing more traffic into the network). In the shortest path algorithm, a cost is associated with every
link in the network. In most cases, the cost is proportional to the delays. The objective is to find a multilink path
joining two nodes that has minimum total cost. Different implementations of the shortest paths algorithm, in both synchronous
and asynchronous fashion, are available [4]. In this paper we consider two different NN implementations of the shortest paths
algorithm, using the actual delay and the derivative delay as cost functions. The neural network structure of the algorithm
was first introduced by Rauch and Winarske [5]. Their method, however, has serious limitations. It can find the shortest
path for a given OD pair only when the number of links that the path contains is known, which is an unrealistic assumption. A
modified structure is suggested in the present paper so that the algorithm can work for an arbitrary and unknown number of links
in a given OD pair. The NN that is presented in this paper was first introduced in [7].

II. Problem Statement

Consider an N-node network and assume that the connectivity of the network is known. Let $c_{ij}$ denote the capacity of the
link connecting node i with node j. If there is no direct connection between node i and node j, $c_{ij} = 0$. Therefore the network
can be described by an NxN capacity matrix C with entries $c_{ij}$. In addition, if every link in the network is a two-way link
and has the same capacity in each direction, C is symmetric.

Our problem is to find the path connecting origin and destination nodes that minimizes a cost function such as the
expected delay. Since the expected delay across a link is a function of the link capacity $c_{ij}$ and the actual link traffic
$f_{ij}$, several functions can be used to calculate the link cost [4], [5]. For example, the link cost $w_{ij}$ can be determined by

$$w_{ij} = f_0 + \left(\frac{\sum f_{ij}}{c_{ij} - \sum f_{ij}}\right)^{p} \qquad (1.1)$$

where $f_0$ is the transmission data for each link, and $\sum f_{ij}$ is the total flow from all OD-pairs on the link ij. The exponent p
can take any positive value, but commonly used values are 1 or 2. The value p = 1 was used in the simulations. The link
capacity $c_{ij} - \sum f_{ij}$ in (1.1) is the residual capacity in the network when paths for multiple OD-pairs are considered. When
the optimal paths for multiple OD-pairs are determined sequentially, the residual link capacity is determined by

$$c_{ij} - \sum f_{ij} = c_{ij} - \sum f_{ij}(\text{previous OD-pairs}) - \sum f_{ij}(\text{current OD-pair}) \qquad (1.2)$$
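The two formulas above can be sketched in code as follows. The function names, the default `f0 = 0.0`, the constant `large` standing in for a "very large" cost, and the treatment of a saturated link are assumptions of this sketch rather than details given in the text.

```python
def residual_capacity(capacity, previous_flow, current_flow):
    """Eq. (1.2): capacity left on a link after the flows of the previous
    OD-pairs and of the current OD-pair have been subtracted."""
    return capacity - previous_flow - current_flow

def link_cost(capacity, total_flow, f0=0.0, p=1, large=1e6):
    """Eq. (1.1): w = f0 + (F / (c - F))**p, with F the total flow on the link.

    A link with no residual capacity (or no capacity at all) gets the
    hypothetical cost `large`; this handling is an assumption of the sketch.
    """
    residual = capacity - total_flow
    if residual <= 0.0:
        return large
    return f0 + (total_flow / residual) ** p
```

With a link capacity of 0.5 and a total flow of 0.1, for instance, the flow-dependent part of the cost is 0.1 / 0.4 = 0.25 for p = 1.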

With p = 1 in (1.1), two different approaches were taken to solve for the shortest path. In the first approach, which
will be referred to as the delay cost approach, the link cost was computed directly using (1.1). In the second approach, which
will be referred to as the derivative delay cost approach, the link cost was equated to the derivative of $w_{ij}$ in (1.1), Eq. (1.3),
which is the link cost that is used in the conventional, optimal solution of the shortest path problem assuming a convex
delay (cost) function [4]. The differences in the numerical solutions obtained under the two cost
functions (1.1) and (1.3) are discussed in the simulations.
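As a worked step, the derivative of the delay cost can be written out explicitly. The calculation below assumes that the derivative is taken with respect to the total link flow $F_{ij} = \sum f_{ij}$, that $f_0$ does not depend on $F_{ij}$, and that p = 1; it is a sketch of the computation under these assumptions and not necessarily the exact form printed as Eq. (1.3).

$$w_{ij} = f_0 + \frac{F_{ij}}{c_{ij} - F_{ij}}
\quad\Longrightarrow\quad
\frac{\partial w_{ij}}{\partial F_{ij}}
= \frac{(c_{ij} - F_{ij}) + F_{ij}}{(c_{ij} - F_{ij})^{2}}
= \frac{c_{ij}}{(c_{ij} - F_{ij})^{2}}$$

Under these assumptions the derivative cost grows like the inverse square of the residual capacity, so it penalizes nearly saturated links more sharply than the delay cost itself.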

Let the NxN matrix W with entries $w_{ij}$ denote the cost matrix associated with the network. Notice that if there is no
direct link between node i and node j ($c_{ij} = 0$), then $w_{ij} = \infty$. If this is the case, a very large number is assigned
to $w_{ij}$ in the cost matrix.

To illustrate the problem, consider the 5-node network in Figure 1. The number beside each link represents the
corresponding link cost ($w_{ij}$). The cost matrix W associated with this network is given in Table 1, where L is some large
positive number.

Table 1 Cost Matrix for the 5-Node Network of Fig. 1

The shortest path from node 1 to node 5 is obviously 1-2-3-5 and the minimum total cost is $w_{12} + w_{23} + w_{35} = 4$.

In the next section we formulate the shortest path algorithm using a neural network structure.

III. Neural Network Computation Algorithm

In their paper [5], Rauch and Winarske suggest that the solution of the shortest path algorithm can be represented by
a 2-dimensional neuron array $V = \{V_{ij}\}$, with each output of the neurons in the array having value 0 or 1. The number of
rows in the array is equal to N, the number of nodes in the network, and the number of columns is equal to the number of nodes
that the path contains. For the 5-node network of Figure 1, the shortest path connecting node 1 and node 5 can thus be
represented by

         1   2   3   4
    1    1   0   0   0
    2    0   1   0   0
    3    0   0   1   0
    4    0   0   0   0
    5    0   0   0   1        (2)

It is obvious that for the array to represent a valid path, there can be only one nonzero entry in each column and there can
be at most one nonzero entry in each row (this condition is different from the one required by the TSP problem). A nonzero
entry in the (i,j)th position of the array can be interpreted as 'node i is the jth node in the path'. Using this representation, a
total of NxM neurons are needed to represent all the paths having length (number of nodes in the path) M. Given an OD pair, the
first and the last column of the array are fixed, so there are Nx(M-2) active neurons in the array which are free to be
updated.
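The array representation and its validity condition can be sketched as below. Nodes are numbered from 1 as in the text; the function names are hypothetical and introduced only for this illustration.

```python
def path_to_array(path, n_nodes):
    """Encode a path as an n_nodes x len(path) array of 0/1 entries.

    Entry [i][j] is 1 when node i+1 is the (j+1)-th node of the path.
    """
    array = [[0] * len(path) for _ in range(n_nodes)]
    for position, node in enumerate(path):
        array[node - 1][position] = 1
    return array

def is_valid_path_array(array):
    """Check the condition stated above: exactly one nonzero entry in each
    column and at most one nonzero entry in each row."""
    n_cols = len(array[0])
    columns_ok = all(sum(row[j] for row in array) == 1 for j in range(n_cols))
    rows_ok = all(sum(row) <= 1 for row in array)
    return columns_ok and rows_ok
```

For the path 1-2-3-5 in a 5-node network this gives a 5x4 array whose fourth row is all zeros, since node 4 is not on the path.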

As we have mentioned in the previous section, this representation has its limitations. The problem with this
representation is that we do not know how many nodes the shortest path would contain, i.e. M is unknown. Rauch and
Winarske assumed that the minimum number of links between a given OD pair could be obtained in advance from the capacity
matrix C, in which case M is equal to that number plus one. However, by choosing M this way, we may not be able to find the
shortest path, because it is possible that a longer path can have lower total cost than that of a shorter path. In our 5-node
example, the minimum number of links between node 1 and node 5 is 2. If we choose M = 3, the 5x3 array can only give the path that
contains 3 nodes, which is a two-link path with cost 6. We know though from the previous discussion that the shortest path is 1-2-3-5
with cost 4. It is obvious that the solution given by the Rauch-Winarske (R-W) method is not the correct one.

To overcome the limitations of the R-W method we fix the number of columns in the array at N, which is the maximum
possible number of nodes any path could contain in an N-node network. By doing so, the neuron array (NxN now) can represent
all the paths containing N nodes. Since most of the paths have length less than N, we should convert these paths into length
N paths and maintain their total cost at the same time in order for them to be represented by the NxN array. This can be
achieved by adding some zero-cost pseudo links to these "shorter" paths, i.e. paths with length less than N, until their length
is equal to N. To implement this idea, for each node we introduce one zero-cost pseudo link that connects the node to itself.
The traffic can thus circle at any node along the path through these pseudo links without increasing the total cost of the
path. For the 5-node example of Fig. 1, the network after introducing 5 pseudo links is shown in Fig. 2. Table 2 gives the
cost matrix of the modified network. By comparing Table 1 with Table 2, one can see that the only difference between the two cost matrices is that the
diagonal elements now become zero instead of a large number L. Using this modified representation, one of the possible
solutions of the 5-node shortest path problem with node 1 being the origin and node 5 the destination is

         1   2   3   4   5
    1    1   1   0   0   0
    2    0   0   1   0   0
    3    0   0   0   1   0
    4    0   0   0   0   0
    5    0   0   0   0   1        (3)

(3) shows that the shortest path between node 1 and node 5 is 1-1-2-3-5, which can be interpreted as 1-2-3-5. Note that the
representation of the shortest path is not unique: solutions 1-2-2-3-5, 1-2-3-3-5, and 1-2-3-5-5 all represent the same path.
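Reading a path back from the NxN array amounts to taking the single active node in each column and then dropping consecutive repetitions, which correspond to traversals of zero-cost pseudo links. The sketch below illustrates this; the function names are hypothetical.

```python
def array_to_padded_path(array):
    """Read the node (numbered from 1) that is active in each column."""
    n_cols = len(array[0])
    padded = []
    for j in range(n_cols):
        nodes = [i + 1 for i, row in enumerate(array) if row[j] == 1]
        if len(nodes) != 1:
            raise ValueError("column %d does not hold exactly one node" % (j + 1))
        padded.append(nodes[0])
    return padded

def collapse_pseudo_links(padded):
    """Remove consecutive repetitions, i.e. traversals of pseudo links."""
    path = []
    for node in padded:
        if not path or path[-1] != node:
            path.append(node)
    return path
```

Applied to the array in (3), the first function returns 1-1-2-3-5 and the second reduces it to 1-2-3-5; the other paddings listed above collapse to the same path.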

For a solution to be valid, we require that there is only one nonzero entry in each column and that the total number of
nonzero entries in the array is equal to N. Under these constraints the energy function associated with the network can be
defined as

$$E = \frac{A}{2}\sum_{i}\sum_{m}\sum_{j} w_{im}\, V_{ij}\,(V_{m,j+1} + V_{m,j-1}) + \frac{B}{2}\sum_{j}\sum_{i}\sum_{m \neq i} V_{ij} V_{mj} + \frac{C}{2}\Big(\sum_{i}\sum_{j} V_{ij} - N\Big)^{2} \qquad (4)$$

where $V_{ij}$ is the output of the neuron in row i (node) and column j (position in the path), and terms with a column index
outside 1, ..., N are taken as zero. The first triple summation gives the total cost from the origin to the destination; the second and third terms are the
constraints imposed on the output of the neuron array to make it converge to a valid path; A, B and C are positive normalization
factors.
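To make the roles of the three terms concrete, the sketch below evaluates an energy of this kind for a given output array. It assumes the form in which each neuron is coupled to both neighbouring columns (the form that matches the connection weights of Eq. (5)), with out-of-range columns contributing zero; the function name and the 0-based indexing are choices made for this illustration.

```python
def energy(V, W, A, B, C):
    """Energy of an N x N output array V for cost matrix W.

    V[i][j] is the output of the neuron for node i at path position j
    (0-based). Assumed form: cost term over both neighbouring columns,
    a column-exclusivity term, and a total-activity term.
    """
    n = len(V)

    def v(m, j):
        # Outputs outside the array are treated as zero.
        return V[m][j] if 0 <= j < n else 0.0

    cost = sum(W[i][m] * V[i][j] * (v(m, j + 1) + v(m, j - 1))
               for i in range(n) for m in range(n) for j in range(n))
    column = sum(V[i][j] * V[m][j]
                 for j in range(n) for i in range(n) for m in range(n) if m != i)
    total = sum(sum(row) for row in V)
    return 0.5 * A * cost + 0.5 * B * column + 0.5 * C * (total - n) ** 2
```

For a valid 0/1 array and a symmetric W, the last two terms vanish and the first reduces to A times the total path cost, so lower-cost paths have lower energy.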


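From Equation (4) we can obtain the connection weight between the ij-th neuron and the mn-th neuron in the array

$$T_{ij,mn} = -A\, w_{im}\,(\delta_{n,j+1} + \delta_{n,j-1}) - B\, \delta_{jn}\,(1 - \delta_{im}) - C \qquad (5)$$

where $\delta_{ij}$ is the Kronecker delta, i.e. $\delta_{ij} = 1$ if i = j and $\delta_{ij} = 0$ if i ≠ j.

The state of the ij-th neuron, $u_{ij}$, can be described by the differential equation

$$\frac{du_{ij}}{dt} = -\frac{u_{ij}}{\tau} + \sum_{m}\sum_{n} T_{ij,mn} V_{mn} + I_{ij} \qquad (6)$$

and

$$V_{ij} = g(u_{ij}) = \frac{1}{2}\left[1 + \tanh\left(\frac{u_{ij}}{u_0}\right)\right] \qquad (7)$$

$$I_{ij} = C N \quad \text{(input bias term)} \qquad (8)$$

for 1 ≤ i ≤ N, 2 ≤ j ≤ N-1.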
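The connection weights and one update of the neuron states can be sketched as follows. The forward-Euler discretization with step `dt`, the 0-based indexing, and the function names are assumptions of this sketch; the text does not specify the numerical scheme that was used.

```python
import math

def weight(i, j, m, n, W, A, B, C):
    """Eq. (5): connection weight between neuron (i, j) and neuron (m, n)."""
    d = lambda a, b: 1.0 if a == b else 0.0  # Kronecker delta
    return (-A * W[i][m] * (d(n, j + 1) + d(n, j - 1))
            - B * d(j, n) * (1.0 - d(i, m))
            - C)

def euler_step(u, V, W, A, B, C, u0, tau=1.0, dt=1e-3):
    """One forward-Euler step of Eq. (6), followed by Eq. (7).

    Only the active neurons (columns 1 .. N-2 in 0-based indexing) are
    updated; the first and last columns stay fixed by the OD pair.
    """
    N = len(V)
    bias = C * N  # Eq. (8)
    new_u = [row[:] for row in u]
    new_V = [row[:] for row in V]
    for i in range(N):
        for j in range(1, N - 1):
            net = sum(weight(i, j, m, n, W, A, B, C) * V[m][n]
                      for m in range(N) for n in range(N))
            new_u[i][j] = u[i][j] + dt * (-u[i][j] / tau + net + bias)
            new_V[i][j] = 0.5 * (1.0 + math.tanh(new_u[i][j] / u0))
    return new_u, new_V
```

Building the weights on the fly keeps the sketch short; a practical implementation would precompute them or exploit their sparsity, since the full weight table has N^4 entries.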

In the high gain limit, the output of the neuron is close to 0 or 1, and the energy function defined by (4) will be
minimized (it could be a local minimum) when the system reaches its steady state.

The algorithm was tested by simulation on 9-node and 16-node networks
with different link cost assignments. The 9-node grid network shown in Figure 3 was used for the first pilot simulation.
All links were assumed to be two-way links and to have the same capacity. Under this assumption, the capacity matrix C is
symmetric. We also assumed that the initial link cost is inversely proportional to the link capacity, because we do not
have any knowledge about the link flow (traffic) when we first start the algorithm. So, the cost matrix W is also
symmetric, with all nonzero entries the same. The diagonal entries of W, which correspond to pseudo links, are all zero, and a
large number is assigned to the entries which correspond to "open" links. The cost matrix used in the pilot simulations is
given in Table 3.

Figure 3 A 9-Node Grid Network with Pseudo Links

Table 3 Cost Matrix for the 9-Node Grid Network of Fig. 3

In the pilot simulations only one OD-pair was considered for the given cost matrix. For each given OD pair, the first
and the last column in the neuron array are fixed. The states of the remaining N(N-2) (= 63 in the 9-node network) active neurons
are updated according to the steady state expression of Equations (6) and (7). The initial value of the output of each active
neuron is a random number uniformly distributed in [0, 2/N] such that

$$E\Big\{\sum_{i}\sum_{j} V_{ij}\Big\} = N \qquad (9)$$
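A sketch of this initialization is shown below. Each of the N(N-2) active outputs has mean 1/N, which contributes N-2 to the expected sum, and the two fixed entries for the origin and destination contribute the remaining 2, so the expectation in (9) equals N. The function name, the seeded generator, and the 1-based node arguments are choices made for this illustration.

```python
import random

def initial_outputs(n, origin, destination, rng=None):
    """Initial N x N output array for one OD pair (nodes numbered from 1).

    Active neurons (all columns except the first and last) are drawn
    uniformly from [0, 2/N]; the first column marks the origin and the
    last column marks the destination.
    """
    rng = rng or random.Random(0)
    V = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(1, n - 1):
            V[i][j] = rng.uniform(0.0, 2.0 / n)
    V[origin - 1][0] = 1.0
    V[destination - 1][n - 1] = 1.0
    return V
```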

We start the network with low gain, i.e., the slope of the hyperbolic tangent curve in Equation (7) is small ($u_0$
large). This choice would allow the system to find better minima of the energy surface. After 100 iterations, we start
slowly increasing the gain (decreasing $u_0$) until the system converges and the values of $V_{ij}$ are near 0 or 1. The results for
the 9-node pilot study were obtained with the following parameters.

Aa2aS«CaS00,aa9i,tal 
y^  a  290  (lalHaU,  yg  «  20  (Baal) 

The algorithm is sensitive to these parameters, since a bad operating point may result in divergence (oscillation).

Table 4 shows the shortest path found by the algorithm between node 1 and node 9; (a) is the initial condition, (b) is
the result after 100 iterations and (c) the result after 200 iterations (final result). The shortest path found is 1-1-2-2-2-
2-5-6-9, which can be interpreted as 1-2-5-6-9. Table 5 gives similar results for a different OD pair.

Because of the symmetry of the grid network, the shortest path is not unique for some OD pairs, which makes the
convergence more difficult. The algorithm will find one of the shortest paths depending on the initial conditions. In the
simulations, we also noticed that if the gain is fixed at a higher value right from the beginning, the system very easily
gets stuck at some local minimum. By starting at low gain and slowly increasing it, we have, so far, been able to reach the
global minimum on all single OD-pair tests for the pilot 9-node grid network.

To obtain the actual traffic distribution for the entire network, the algorithm should be repeated for every OD pair (there are
Nx(N-1) of them). After the actual traffic conditions in the network become available, the cost matrix W can be updated by
using equations (1.1) and (1.2). The algorithm is then repeated for each OD pair again, and the optimal path is found for
each OD pair that will prevent some links from becoming too crowded. By so repeating the algorithm, the ANN could eventually
obtain the optimal flow distribution for the network, in the sense that the expected delay on the entire network is minimized
for a given set of link capacities. This approach was used to obtain shortest paths for multiple OD pairs in 9- and 16-node
networks.
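The sequential procedure can be sketched as a loop that routes one OD pair at a time and updates the link costs from the accumulated flow. To keep the example short and runnable, the neural network solver is replaced here by Dijkstra's algorithm as a stand-in shortest-path routine; this substitution, the 0-based node numbering, the default `f0 = 1.0`, the constant `large`, and the function names are all assumptions of the sketch.

```python
import heapq

def shortest_path(W, src, dst):
    """Dijkstra's algorithm on a dense cost matrix (stand-in for the NN)."""
    n = len(W)
    dist = [float("inf")] * n
    prev = [None] * n
    dist[src] = 0.0
    heap = [(0.0, src)]
    while heap:
        d, a = heapq.heappop(heap)
        if d > dist[a]:
            continue  # stale heap entry
        for b in range(n):
            if b != a and d + W[a][b] < dist[b]:
                dist[b] = d + W[a][b]
                prev[b] = a
                heapq.heappush(heap, (dist[b], b))
    path = [dst]
    while path[-1] != src:
        path.append(prev[path[-1]])
    return path[::-1]

def route_sequentially(capacity, od_pairs, flow, f0=1.0, large=1e6):
    """Route OD pairs one at a time, updating costs with Eqs. (1.1)-(1.2), p = 1."""
    n = len(capacity)
    load = [[0.0] * n for _ in range(n)]  # flow already placed on each link
    routes = []
    for src, dst in od_pairs:
        W = [[large] * n for _ in range(n)]
        for a in range(n):
            for b in range(n):
                residual = capacity[a][b] - load[a][b]
                if a != b and residual > 0.0:
                    W[a][b] = f0 + load[a][b] / residual
        path = shortest_path(W, src, dst)
        for a, b in zip(path, path[1:]):
            load[a][b] += flow
        routes.append(path)
    return routes
```

Because the cost of a link rises with the flow already assigned to it, a second OD pair with the same end points tends to be steered onto links that the first one did not use, which is the crowding-avoidance effect described above.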

After the pilot simulation was successfully completed, the ANN algorithm was tested on multiple OD-pairs in both 9-node
and 16-node networks. In the 9-node network, the algorithm was tested with four different OD-pairs. Each link was assumed to
have normalized capacity 0.5, whereas the flow on each OD-pair was taken to be 0.1. The 'optimal' paths were obtained
sequentially by presenting to the ANN one OD-pair at a time. The initial conditions on A, B, C, η, and τ that were used in
the pilot simulation were also used in these simulations. The annealing temperature schedule was slightly different: the
initial temperature was kept the same as in the pilot simulation, but the final temperature was set to a different value for all
runs. After the network converged to an 'optimal' path for a given OD-pair, the link cost was updated according to Eq.
(1.1) with p = 1. The four chosen OD-pairs were A = (1,8), B = (2,8), C = (4,2), and D = (7,3) (the first number in the
parentheses indicates the origin and the second the destination). For each OD-pair, convergence was achieved after 200
iterations, in agreement with the pilot simulation. The initial and final neural activations for each OD-pair are shown in
Table 6. In this particular simulation, the order in which the OD-pairs were presented to the ANN was ABCD, and a single path was
assumed to carry all the traffic for each OD pair. The same initial conditions (neural activations) were used for each OD-
pair. The 'optimal' path set that was obtained was A = (1-4-5-8), B = (2-5-8), C = (4-1-2), and D = (7-4-5-2-3), with total cost
16.63948. This path set is not the overall optimal one, which has cost 15.42510 (see Table 7), but is very close to it.

In order to determine the effect of the sequence in which the different OD-pairs are presented to the neural network,
all possible permutations in the sequence of the four OD-pairs were presented and the 'optimal' paths were recorded. Table 7
summarizes the different 'optimal' paths and the frequency with which they occurred. When the same initial conditions were used for each
OD-pair, the set of paths 1, with total cost 16.63948, very close to the optimal set of paths 10 with cost 15.42510, was
obtained 44.47% of the time. When the initial conditions (neural activations) for each OD-pair were chosen randomly, the
frequency of path set 1 dropped to 29.17%. However, the frequency of the optimal path set 10 increased from 0.0%, which it was
when the same initial conditions were used, to 12.50%. The effect of the initial conditions is currently being investigated. From
the results obtained so far, it appears that the different initial conditions result in a more even distribution of the path
sets among low cost path sets than the distribution of the path sets obtained with fixed initial conditions. Table 8
summarizes the correspondence between the sequence with which the four OD-pairs were presented to the network and the path set
that the NN converged to under fixed initial conditions and different initial conditions. The numbers of the path sets
correspond to the path set numbers of Table 7.

In order to determine the stability of the 'optimal' path sets, two OD-pairs were alternately presented to the NN and
the path sets were recorded. The chosen OD-pairs were A = (1,9) and B = (8,4). The link capacity was kept the same, i.e.
0.5 units per link, but the input data flow was raised to 0.45 data units. Starting with zero initial neural activation,
fixed for each OD-pair, the NN converged to a stable solution in one iteration. Furthermore, it converged to the same path
set, irrespective of which OD-pair was presented first (columns 1 and 2, Table 9). When the initial neural activations were
random but fixed for all OD-pairs, the solution was stabilized in a few iterations (columns 3 and 4 in Table 9). The same path
sets were obtained irrespective of which OD-pair was presented first. However, when a different random initial activation was
used each time a new OD-pair was presented, the path sets stabilized after a few presentations at slightly different path
sets, depending on which OD-pair was presented first. In this particular experiment, all the different path sets that were
obtained are equivalent from the cost point of view.

To test the ability of the ANN to optimize the network performance further by creating multi-path routes for multiple
OD pairs, a comparative study was conducted by allocating different percentages of the total flow on each path and repeating
the algorithm by interleaving the different OD pairs until the total traffic from all OD-pairs was accommodated. The
simulations were conducted using both the link cost [Eq. (1.1)] and the derivative delay link cost [Eq. (1.3)].
For a single OD pair but different percentages of traffic allocation at each 'shortest' path, the results for the two different
cost functions applied to a 9-node network are summarized in Table 10 and Fig. 4. From these results, it can be seen that
smaller increments per iteration result in lower total cost, in general. Furthermore, the derivative delay cost function
(1.3) yields paths that slightly outperform those obtained by the delay cost function (1.1) for most increments. However, the
differences are not so significant. One advantage of the derivative delay cost function is that the loops observed
in the 'shortest' paths were eliminated completely in the run cases. A small percentage of 'shortest' paths, usually less than
5%, occasionally contained loops when the delay cost function was used instead. The existence of loops is currently under
investigation.

A simulation identical to the one described in the previous paragraph was conducted for three OD pairs in a nine-node
network. The results for the delay and derivative delay cost functions are summarized in Figs. 5 and 6 respectively. Similar
conclusions to the single-path experiment can be drawn: lower increments per iteration result in lower total cost. The
derivative delay cost yields slightly better results than the delay cost function itself. Analytical statistics of the number
of times each path appeared when different increments of flow were used to obtain the shortest paths are given in Tables 11 and
12 for 100%, 15%, and 1% increments per iteration. The amount of flow each path carries is also shown in the tables. As can
be seen, most of the traffic flow is concentrated in a few 'good' paths as the size of the flow increment decreases. Furthermore,
the derivative delay cost yields a slightly lower final cost (delay) than the delay cost.

The NN routing algorithm was also tested in a 16-node square grid network with link capacity 0.5 units, the same as in
the 9-node network. Four OD-pairs were used to test the NN. The test OD-pairs were A = (1,4), B = (2,12), C = (14,4),
D = (1,13). When the OD-pairs were presented to the NN in the ABCD sequence, the optimal path set was obtained after 200
iterations for each OD-pair, Table 13. The same annealing schedule as in the 9-node case was used. Note that the path set
in Table 13 is globally optimal. Due to space limitations, initial conditions, intermediate results after 100 iterations,
and final results after 200 iterations are only given for OD-pair A. For the other three OD-pairs only cumulative, final
results are given. The sensitivity of the solution to the order in which the different OD-pairs are presented to the NN is
being investigated. Furthermore, the appearance of paths that contain closed loops, which seem to appear when the cost of
looping is low and the path of the initially presented OD pair splits the network graph into two separate subgraphs, is
also being investigated. Additional details on the simulation results on a 16-node network can be found in [8].

V.  Conclusions 

In this paper, a neural-based computational algorithm has been developed for solving optimal traffic routing problems
in communication networks. The key idea in this algorithm is the introduction of pseudo links, which allow the extension
of any path to a length-N path so that it can be represented by an N×N neuron array. The proposed NN algorithm can be
used to
obtain optimal routes for multiple OD-pairs by presenting to the NN one OD-pair at a time. Sequential presentation of
OD-pairs to the NN [illegible] a unique path for each OD-pair. Once a shortest path is obtained it can be used to carry
the [illegible] traffic for a given OD pair, provided its capacity is not exceeded. A more even distribution of the flow
from each OD pair using multi-paths is obtained by allocating only a percentage of the total flow from a given OD pair to
a 'shortest' path, and repeating the algorithm by interleaving the different OD pairs. An implementation of the algorithm
using incremental, circular [illegible] with multiple paths per OD-pair has been tested numerically and found to reduce
the routing cost (i.e., total delay). Computer simulation results on 9- and 16-node grid networks show that the algorithm
performs well when the slope of the nonlinearity curve (characteristic of the neuron) is slowly increased during
iterations. The 'optimal' path sets were found to be close to the global optimum and to be stable independent of the
sequence in which the OD-pairs were presented to the NN. Simulation results on a 16-node network indicate that the NN
algorithm continues to perform well in larger networks. The performance of the algorithm in [illegible], multi-node
networks with different connectivity is presently being evaluated.

ABSTRACT

A nonlinear adaptive detector/estimator is introduced for single and multiple sensor data processing. The problem of
target detection from returns of monostatic sensor(s) is formulated as a nonlinear joint detection/estimation (JDE)
problem on the unknown parameters in the signal return. The unknown parameters involve the presence of the
target, its range, and azimuth. The problems of detecting the target and estimating its parameters are considered
jointly. A bank of spatially and temporally localized nonlinear filters is used to estimate the a posteriori likelihood
of the existence of the target in a given space-time resolution cell. Within a given cell, the localized filters are used
to produce refined spatial estimates of the target parameters. A decision logic is used to decide on the existence of a
target within any given resolution cell based on the a posteriori estimates deduced from the likelihood functions. The
inherent spatial and temporal referencing in this approach is used for automatic referencing required when multiple
sensor data is fused together.


1.  RANGE  ESTIMATION  FROM  COLOCATED  SENSORS 

This section considers the problem of localizing a target in range space from data received at one or more colocated
sensor(s). The range-Doppler space is partitioned into a number of resolution cells. Each cell is identified with a
hypothesis that the signal is present in it. A JDE scheme is then used to localize the target and refine its parameter
estimates. The measurements that are used to localize the target consist of signal returns corrupted by additive
white Gaussian and non-Gaussian noise.

The problem is formulated using the JDE procedure adapted to problems with uncertain initial conditions.
The approach involves the operation of several nonlinear independent filters in parallel. In the case of Gaussian
measurement noise the extended Kalman filter (EKF) is used for estimation. An extended high order filter (EHOF)
is used in non-Gaussian noise. The parallel filters are distinguished by the initial conditions used to set up the problem.
Along with the state estimate, the a posteriori probability of each hypothesis is computed recursively.

1.1  Problem  Statement 

Consider the problem of signal detection and parameter estimation in the context of the reception of an active echo
return from an object that has been illuminated by a monostatic source. The situation is considered in which there are
P collocated sources that illuminate the target simultaneously, but with different carrier frequencies, designated
$\omega_{c_p}$. The received signal at each sensor is frequency-translated by mixing it with a signal at frequency
$\omega_{t_p}$. The resulting signal is low-pass filtered and digitized at a rate $f_s$, which is at least twice the
highest frequency in the data. The time between samples is denoted $t_s$. It is assumed that all sensors have the same
digitization rate, and that all clocks are synchronized. The general expression for the received signal at the $p$th
sensor, under the signal-present assumption, can be written

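$$z_{p_k} = a_{p_k}(\tau_k)\, p_{p_k}(\tau_k, \nu_k)\, s_{p_k}(\tau_k, \nu_k) + w_{p_k} \tag{1}$$

where $a_{p_k}(\tau_k)$ is the received signal amplitude, $p_{p_k}(\tau_k, \nu_k)$ is the pulse shaping function, and

$$s_{p_k}(\tau_k, \nu_k) = \cos\big(\nu_k\, \omega_{c_p}(k t_s - \tau_k) - \omega_{t_p} k t_s\big) \tag{2}$$

$w_{p_k}$ is white noise with $E[w_{p_k}] = 0$ and $E[w_{p_k} w_{p_j}] = \sigma_w^2\, \delta(k - j)$. $\tau_k$ is the time
delay between signal transmission and reception; it is a function of the range $D_k$ between the receiver and the
object, and is given by

$$\tau_k = \frac{2 D_k}{c} \tag{3}$$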


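For unambiguous range estimation the uncertainty in $\tau_k$, denoted $\Delta\tau$, is bounded by
$\Delta\tau < 2\pi/(\nu_k \omega_{c_p})$. This is due to the fact that the $\cos(\cdot)$ function is not monotonic
(i.e., $s_{p_k}(\tau_1, \nu_k) = s_{p_k}(\tau_2, \nu_k)$ if $\tau_1 - \tau_2 = 2\pi/(\nu_k \omega_{c_p})$).
$p_{p_k}(\tau_k, \nu_k)$ is the pulse shaping function, which has average energy $E_s$.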


1.2 Joint Detection/Estimation


In this section we describe the JDE procedure for optimal estimation of time delay and Doppler shift assuming the
presence of the target has been detected. The range of uncertainty in delay and Doppler is partitioned into a finite
number of resolution cells. Each cell is associated with a hypothesis $H_i$. The hypotheses are distinguished from
each other by the initial conditions on the initial state estimates, $\hat{x}_{0|0,\theta_i}$, and initial state
covariances, $P_{0|0,\theta_i}$. The measurement and process models are the same for each hypothesis. Let
$\theta_i \in \Theta$ designate the parameter vector that describes the different initial conditions on the states. The
parameter vector is also assumed to be time invariant. Under hypothesis $H_i$ the discrete-time measurements are
modeled according to

$$H_i:\quad z_k = g_k(x_k) + v_k \tag{4}$$

with initial conditions $\hat{x}_{0|0,\theta_i}$, $P_{0|0,\theta_i}$.


The measurement vector is composed of the scalar measurements of the P individual sensors such that

$$z_k = [z_{1_k},\, z_{2_k},\, \ldots,\, z_{P_k}]^T \tag{5}$$


The state $x_k$ is common for all $\theta_i \in \Theta$, and satisfies the discrete-time process equation

$$x_k = f(x_{k-1}) + w_{k-1} \tag{6}$$

The initial state estimate, the measurement noise, and the process noise are uncorrelated. The process and measurement
noise are zero mean and distributed with covariances $E[w_k w_k^T] = Q_k$ and $E[v_k v_k^T] = R_k$.

For each $\theta_i \in \Theta$ (each assumed model), a minimum variance estimate of the model parameters is obtained
recursively using the JDE technique. These estimates are subsequently used to estimate the likelihood of each model
being the correct one. Based on these likelihood estimates, a maximum a posteriori (MAP) decision criterion or a minimum
mean squared error (MMSE) decision criterion can be used to select the proper model.


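From Bayes' rule, the a posteriori probability of the parameter vector $\theta_i$ is updated recursively by

$$P(\theta_i \mid Z_k) = \frac{P(\theta_i \mid Z_{k-1})\, p(z_k \mid Z_{k-1}, \theta_i)}{\sum_{j=1}^{M} P(\theta_j \mid Z_{k-1})\, p(z_k \mid Z_{k-1}, \theta_j)} \tag{7}$$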


where $Z_{k-1} = \{z_1, z_2, \ldots, z_{k-1}\}$. The initial condition for (7) is the a priori probability density
function $p(\theta) = p(\theta \mid Z_0)$, which is assumed to be known. The densities $p(z_k \mid Z_{k-1}, \theta_i)$
are updated using the EKF for estimation in Gaussian noise, or the EHOF for estimation in non-Gaussian noise. Since the
state vector $x_k$ is common to all models, the minimum mean squared error (MMSE) estimate can be used. The MMSE estimate
is expressed as a weighted average of the conditional state estimates $\hat{x}_{k|k,\theta_i}$ over all $\theta_i$ as
follows:

$$\hat{x}_{k|k} = \sum_{i=1}^{M} P(\theta_i \mid Z_k)\, \hat{x}_{k|k,\theta_i} \tag{8}$$

Model (4) can be extended to include the signal-absent case (null hypothesis) by augmenting the set of hypotheses
$\{H_i\}$ with the null hypothesis $H_0$, which has the associated noise-only measurement model

$$z_k = v_k \tag{9}$$

and renormalizing the a priori distribution $P(\theta_i)$, $i = 0, 1, \ldots, M$, where $M$ is the number of resolution cells.
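
The recursion in (7) and the MMSE and MAP selections can be illustrated with a minimal sketch. It assumes scalar conditional estimates and a scalar Gaussian measurement likelihood; in the paper these densities come from the EKF or EHOF, so `gaussian_pdf` is only a stand-in, and all function names here are hypothetical.

```python
import math


def gaussian_pdf(residual, variance):
    """Scalar Gaussian density; an assumed stand-in for p(z_k | Z_{k-1}, theta_i)."""
    return math.exp(-0.5 * residual ** 2 / variance) / math.sqrt(2.0 * math.pi * variance)


def update_posteriors(priors, likelihoods):
    """One step of the Bayes recursion: prior times likelihood, then normalize."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)
    return [j / total for j in joint]


def mmse_estimate(posteriors, estimates):
    """Posterior-weighted average of the conditional estimates."""
    return sum(p * x for p, x in zip(posteriors, estimates))


def map_estimate(posteriors, estimates):
    """Conditional estimate of the hypothesis with the largest posterior."""
    best = max(range(len(posteriors)), key=posteriors.__getitem__)
    return estimates[best]
```

For example, with three equally likely hypotheses whose conditional estimates are 1.0, 2.0 and 3.0, a measurement near 2.2 concentrates the posterior on the second hypothesis; the MAP choice returns 2.0, while the MMSE estimate is pulled slightly toward the third hypothesis. A null hypothesis can be handled the same way by adding one more entry whose likelihood is evaluated under the noise-only model (9).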

1.3 Specification of Initial Conditions

The localized initial conditions for each resolution cell are defined as follows: Let the time delay $\tau_0$ have mean
$\bar{\tau}_0$ and density function $p_{\tau_0}(\tau_0)$. The distribution of $\tau_0$ is segmented into N nonoverlapping
segments such that the segment around some localized initial estimate $\hat{\tau}_{0_n}$ is defined by


We  have 


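Define the scaling parameters $C_n$ such that

$$C_n \int p_{\tau_0,n}(\tau)\, d\tau = 1, \qquad 1 \le n \le N$$

Then the mean and variance of the initial conditions of the segmented model are given by

$$\hat{\tau}_{0_n} = E[\tau_{0_n}] = C_n \int \tau\, p_{\tau_0,n}(\tau)\, d\tau$$

$$\mathrm{Var}[\tau_{0_n}] = C_n \int (\tau - \hat{\tau}_{0_n})^2\, p_{\tau_0,n}(\tau)\, d\tau \tag{10}$$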

With N different initial conditions on $\tau_0$ there are N different resolution cells for referencing the measurements. A
different filter is initialized in each resolution cell. The total number of cells in the resolution space can be large,
depending on the desired accuracy in the parameter resolution. However, the filters can be run in parallel, and
independent of each other, thus reducing the execution time to that of a single filter.

The parameter vector $\theta_i$, $1 \le i \le N$, is defined to be the $i$th resolution cell and is used to define N
initial conditions on the state variable $\tau$. The a priori probabilities of each hypothesis are determined by
integrating the density function $p_{\tau_0}(\tau)$ over the limits defined for each hypothesis. They are given by

$$P(\theta_i) = \int_{\text{cell } i} p_{\tau_0}(\tau)\, d\tau \tag{11}$$
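
As a rough illustration of this segmentation, the sketch below splits a prior density into equal-width cells and computes, for each cell, its a priori probability together with the mean and variance of the renormalized density inside the cell. The equal-width split, the midpoint-rule integration and the name `segment_prior` are assumptions made for the example, not details taken from the paper.

```python
def segment_prior(density, lower, upper, n_cells, n_steps=2000):
    """Return (probability, mean, variance) for each of n_cells equal-width cells."""
    width = (upper - lower) / n_cells
    cells = []
    for n in range(n_cells):
        start = lower + n * width
        step = width / n_steps
        # Midpoint-rule sample points and their probability masses within the cell.
        points = [start + (j + 0.5) * step for j in range(n_steps)]
        masses = [density(t) * step for t in points]
        prob = sum(masses)  # a priori probability of the cell; 1 / prob acts as the scaling C_n
        mean = sum(m * t for m, t in zip(masses, points)) / prob
        var = sum(m * (t - mean) ** 2 for m, t in zip(masses, points)) / prob
        cells.append((prob, mean, var))
    return cells
```

With a uniform prior of half-width 3.5 sample intervals split into seven cells (time measured in units of $t_s$), each cell has probability 1/7, the cell means are spaced one sample interval apart, and each cell variance is close to 1/12.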

1.4 Joint Detection/Estimation of Time Delay

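This section addresses the model in which the state $x_k = \tau_k$ is unknown and to be estimated. The parameter vector
$\theta_i$ is defined as before. Hypothesis $H_i$ is now given by

$$H_i:\quad z_k = \begin{cases} v_k & k t_s < \tau_i \\ a_{p_k}(\tau_k)\, p_{p_k}(\tau_k)\, s_{p_k}(\tau_k) + v_k & \tau_i \le k t_s \le \tau_i + t_w \\ v_k & k t_s > \tau_i + t_w \end{cases}$$

with initial conditions

$$\hat{x}_{0|0,\theta_i} = [\hat{\tau}_{0_i}], \qquad P_{0|0,\theta_i} = [\mathrm{Var}(\tau_{0_i})]$$

and pulse shaping function

$$p_{p_k}(\tau_k) = 0.5\big(1 - \cos(2\pi \nu_k (k t_s - \tau_k)/t_w)\big)$$

where it is observed that the amplitude function $a_{p_k}(\cdot)$ reflects the transmitted amplitude $A$ attenuated by
spherical spreading loss.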

1.5  Experimental  Evaluation 

Both single and double sensor models (P = 1 and P = 2) in (5) were selected for experimental evaluation. For
this evaluation the sampling frequency was $f_s = 100 \times 10^{\text{[illegible]}}$ Hz, the pulse width was set to
$12 t_s$, and $c$, the speed of propagation, was 186000 miles/sec. For all tests, the nominal time delay and Doppler were
$\tau_{nom} = 0.000324$ and $(\nu_{nom} - 1) = 8.96 \times 10^{-7}$, respectively, corresponding to a target at a nominal
range of 10 miles, traveling at 300 mph Doppler velocity.

It was assumed that the error in the time delay estimate was uniformly distributed at $\pm 3.5 t_s$ about the nominal
delay. The corresponding variance is then $(7 t_s)^2/12$. The error in the Doppler estimate was assumed to be uniformly
distributed at $\pm 7.47 \times 10^{-7}$ about the nominal Doppler. This corresponds to an error in the Doppler velocity
of $\pm 250$ mph. The corresponding variance is $1.85 \times 10^{-13}$.
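
The quoted Doppler values and variances can be checked with a few lines of arithmetic. The sketch assumes that the Doppler offset is the two-way ratio $\nu - 1 = 2v/c$, which reproduces the quoted figures but is not stated explicitly in the text; `doppler_offset` and `uniform_variance` are hypothetical helper names.

```python
C_MILES_PER_SEC = 186000.0  # speed of propagation assumed in the evaluation


def doppler_offset(speed_mph, c=C_MILES_PER_SEC):
    """Two-way Doppler offset nu - 1 = 2 v / c, with v converted from mph to miles/sec."""
    return 2.0 * (speed_mph / 3600.0) / c


def uniform_variance(half_width):
    """Variance of a uniform distribution on [-half_width, +half_width]."""
    return (2.0 * half_width) ** 2 / 12.0


nominal_offset = doppler_offset(300.0)       # about 8.96e-7
offset_error = doppler_offset(250.0)         # about 7.47e-7
doppler_var = uniform_variance(7.47e-7)      # about 1.86e-13
delay_var_in_ts2 = uniform_variance(3.5)     # (7 t_s)^2 / 12, in units of t_s^2
```

Under this assumption the 300 mph and 250 mph velocities map to the stated offsets, and the Doppler variance evaluates to roughly $1.86 \times 10^{-13}$, in line with the value quoted above up to rounding.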

1.5.1  Single  Sensor  Evaluation 

The single sensor model was used to compare the use of multiple filters (N = 7) to a single filter (N = 1) for
JDE. With only one filter, $\hat{x}_{0|0,\theta_1} = \tau_{nom}$ and $P_{0|0,\theta_1} = (7 t_s)^2/12$, as described
previously. The initial estimates of time delay for the multiple filter implementation are given by
$\hat{\tau}_{0_n} = (n - 4)\, t_s + \tau_{nom}$, $n = 1, 2, \ldots, 7$. Thus, the initial delay estimates were separated
by $t_s$, with $\mathrm{Var}(\tau_{0_n}) = t_s^2/12$, $\forall n$. The a priori probabilities are given by
$P(\theta_n) = 1/N$, $1 \le n \le N$.

The Monte Carlo simulation results for JDE with a single filter (N = 1) and a bank of seven filters (N = 7) are
shown in Figure 1(a). In this figure the mean squared error (MSE) of the estimation error in $\tau$ is shown as a
function of SNR, where SNR = $10 \log(E_s/\sigma_w^2)$ for $\tau_k \le k t_s \le \tau_k + t_w$, and $E_s$ is the average
received signal energy per sample. Each point on the graph represents the results of 500 simulation runs. Both the MAP
and MMSE estimates are shown in Figure 1(a). The MAP and MMSE estimates are the same for N = 1. Also shown on this
graph are the results for the detection-only (D-O) technique, which is implemented by fixing the estimates at their
initial values. The noise is Gaussian, and the EKF is used to perform estimation in the JDE method. The JDE
(N = 7) implementation gives better results than the D-O method, particularly at higher SNR. This is expected
since the filter in the JDE method allows a considerable refinement of the estimates at higher SNR as compared to low
SNR, where the larger noise covariance restricts the filter gain. At -5 dB SNR the JDE and D-O implementations
perform identically. In general, the MMSE estimates are better than the MAP estimates, particularly at low SNRs.
The JDE (N = 1) implementation gives the worst overall performance. The filter used in this implementation often
converges to poor final estimates due to the tendency, mentioned previously, of time delay to converge to values that
are separated from the actual time delay by multiples of $\pm 1/f_c$.

The JDE (N = 7) technique is evaluated in lognormal noise in Figure 1(b) for the single sensor model. The MMSE
estimates of $\tau$ are shown in this figure for the EKF and for the EHOF. The EKF is evaluated in two configurations.
In the first configuration, the Gaussian pdf is used to evaluate the detection statistic given by equation (7). In the
second configuration, the lognormal pdf is used. The EHOF is evaluated using the lognormal pdf only. The EKF
in the second configuration and the EHOF give very similar results at low SNR. However, at high SNR the EHOF
outperforms the EKF. When the Gaussian pdf is used in conjunction with the EKF to localize the target, the results
are significantly worse than when the proper lognormal pdf is used. This advantage is particularly evident at low
SNRs.

1.5.2 Double Sensor Evaluation

In the multiple sensor case (P > 1), the sensors may have different carrier frequencies and different translation
frequencies $\{\omega_{t_p}\}$. A two-sensor (P = 2) model was evaluated in which
$\omega_{c_1} = 2\pi \cdot 10 \times 10^{\text{[illegible]}}$, $\omega_{c_2} = 2\pi \cdot 30 \times 10^{\text{[illegible]}}$,
and $\omega_{t_1} = \omega_{t_2} = 0$. The MMSE results of this evaluation for JDE (N = 7) are given in Figure 1(c). The
single-sensor (P = 1) MMSE results are also shown in this figure. This figure illustrates the distinct advantage of
centralized fusion for JDE.

1.5.3 Multiple Pulse Processing

The results of processing two pulses are given in Figure 1(d). The EKF and EHOF are configured such that the
initial error covariance is reset at the beginning of each pulse. The rationale for this is to re-excite the system. This
allows poor estimates to possibly converge to smaller errors, and it has been shown experimentally that it
does not significantly affect those estimates that have already converged close to the actual state value. Figure 1(d)
shows an improvement of about 3 dB for the two-pulse estimate over the single-pulse estimate.


2.  RANGE  AND  AZIMUTH  ESTIMATION  FROM  NONCOLOCATED  SENSORS 

Consider the situation of two spatially separated sensors, $s_1$ and $s_2$. Each of the two sensors attempts to detect and
track objects coming into its respective area of coverage. The coverage of the two sensors is assumed to overlap in
space, but not entirely. The sensor geometry is shown in Figure 2. In the overlap region the data received by the two
sensors can be combined to get a more accurate estimate of target parameters or to estimate parameters that cannot
be estimated with one sensor alone. In the overlap region the estimates from the individual sensors are combined to
form improved target parameter estimates. We consider the case where each of the sensors may have different types
of tracking devices such as optical trackers, various types of radars, etc. It is assumed that these sensors transmit a
signal and process the echo returned from that signal. The signals are corrupted by additive Gaussian noise due to
thermal effects within the receiver, and by clutter which may be due to non-Gaussian distortion such as sea clutter
or other multipath spreading. Typical distributions used to model this distortion include the Rayleigh, Weibull or
lognormal distributions. The thermal noise at the receiver is assumed to be uncorrelated from sensor to sensor.

2.6  System  Model 

Assume that each sensor consists of a phased array or some other sensing device that can produce target angle
estimates along with estimates of time delay and Doppler shift. It is assumed that there are two separate measurements
taken at each sensor, one measurement at each of the offset phase centers. The received signal at the $p$th sensor
may be described by

$$z_{p_k} = g_{p_k} + n_{p1_k} + n_{p2_k} \tag{16}$$

where $g_{p_k}$ represents the received signal, $n_{p1_k}$ is the clutter, and $n_{p2_k}$ is the Gaussian noise at the
$k$th sampling interval. Since there are two measurements observed at each sensor, the received signal can be more
explicitly expressed as

$$\begin{bmatrix} z_{p1_k} \\ z_{p2_k} \end{bmatrix} = \begin{bmatrix} g_{p1_k} \\ g_{p2_k} \end{bmatrix} + n_{p1_k} + n_{p2_k} \tag{17}$$

Two unknown delays, $\tau_{p1}$ and $\tau_{p2}$, are introduced in the received signal $g_{p_k}$. The delay $\tau_{p1}$ is
the round-trip propagation time from the center of the sensor to the target and back to the sensor. Referring to
Figure 3, this is the time for the signal to travel from point $P_p$ to $O$ and back to point $P_p$. From $\tau_{p1}$ the
range to the target can be determined using the relationship

$$D_p = \frac{c\, \tau_{p1}}{2} \tag{18}$$

where $c$ is the speed of propagation. The delay $\tau_{p2}$ is the difference in time for the signal to reach from point
$P_{p1}$ to point $P_{p2}$. The difference in the propagation distance is given by $c\, \tau_{p2}$. The differential angle
$\Delta\phi_p$ to the target from sensor $p$, which represents the difference between the sensor pointing angle and the
actual target angle $\phi_p$, is then

$$\Delta\phi_p = \frac{c\, \tau_{p2}}{d_p} \tag{19}$$

where $d_p$ is the distance between the two offset phase centers in the phased array for sensor $p$.
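
The geometric relations between the two delays and the target position can be sketched numerically. The sketch assumes the small-angle form in which the differential angle equals the extra path length $c\, \tau_{p2}$ divided by the phase-center separation $d_p$, and it uses consistent length units throughout; the function names are hypothetical.

```python
def target_range(tau1, c):
    """Range from the round-trip delay: D = c * tau1 / 2."""
    return c * tau1 / 2.0


def differential_angle(tau2, d, c):
    """Small-angle offset from the pointing direction, in radians: c * tau2 / d."""
    return c * tau2 / d


def range_variance(var_tau1, c):
    """Variance of the range estimate implied by the round-trip delay variance."""
    return c ** 2 * var_tau1 / 4.0


def cross_range_variance(var_tau2, nominal_range, d, c):
    """Variance of the cross-range position, nominal_range * differential_angle."""
    return (nominal_range * c / d) ** 2 * var_tau2
```

For instance, with $c$ = 186000 miles/sec a round-trip delay of 2·10/186000 sec maps back to a 10 mile range, and with a 3 foot separation (3/5280 miles) a differential delay of $d \cdot 0.01 / c$ corresponds to an angular offset of 0.01 rad.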

2.6.1  Single  Observer  Model 

Using estimates of $\tau_{p1}$ and $\tau_{p2}$ from one sensor, the target position can be estimated through the relations
(18) and (19). Define the state variable vector for sensor $p$ as $x_{p_k} = [\tau_{p1_k}, \tau_{p2_k}]^T$. It is assumed
that the state does not change while the pulse is being reflected from the target. Therefore the process dynamics are
zero; that is, the state transition matrix is unity and there is no process noise. In terms of the state variables the
received signal at the $p$th sensor is

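$$g_{pj_k}(x_{p_k}) = a_{p_k}(\cdot)\, p_{pj_k}(\cdot)\, s_{pj_k}(\cdot) \tag{20}$$

$$p_{pj_k}(x_{p_k}) = 0.5\big(1 - \cos(2\pi \nu_p (k t_s - x_{p_k}(1) + \kappa_j x_{p_k}(2)/2)/t_{w_p})\big)$$

$$s_{pj_k}(x_{p_k}) = \cos\big(\nu_p\, \omega_{c_p} (k t_s - x_{p_k}(1) + \kappa_j x_{p_k}(2)/2)\big) \tag{21}$$

for $j = 1, 2$. $\kappa_j = +1$ whenever $j = 1$, and $\kappa_j = -1$ whenever $j = 2$. $\nu_p$ is the Doppler velocity
(assumed known in this case), $A$ is the transmitted amplitude, and $a_{p_k}$ reflects attenuation due to spherical
spreading loss. The definition of $p_{pj_k}(\cdot)$ given above represents the Hanning pulse type with pulse width
$t_{w_p}$. The EKF equations for the constant state model given above are given by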

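$$K_{p_k} = P_{p_{k|k-1}} G_{p_k}^T \big(G_{p_k} P_{p_{k|k-1}} G_{p_k}^T + R_{p_k}\big)^{-1}$$

$$P_{p_{k|k}} = (I - K_{p_k} G_{p_k})\, P_{p_{k|k-1}} \tag{22}$$

$$\hat{x}_{k|k} = \hat{x}_{k-1|k-1} + K_{p_k}\, \varepsilon_{p_k}$$

$$\varepsilon_{p_k} = z_{p_k} - g_{p_k}(\hat{x}_{k|k-1})$$

where $R_{p_k}$ is the measurement covariance, $K_{p_k}$ is the filter gain, and $G_{p_k}$ is the Jacobian of the
measurement model. The EHOF incorporates 3rd and 4th order estimation error and measurement error moments. However,
the equations are very lengthy and are not presented here.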
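
A constant-state EKF measurement update of this kind can be sketched in scalar form. The sketch makes several simplifying assumptions that are not in the paper: the state is a single delay rather than the two-element vector, the Jacobian is taken by central differences, the Doppler factor is fixed at one, and the measurements in `run_filter` are noise free. The names `hanning_return`, `ekf_update` and `run_filter` are hypothetical.

```python
import math


def hanning_return(tau, k, t_s, t_w, omega_c, nu=1.0, amp=1.0):
    """Noise-free sample k of a Hanning-windowed cosine delayed by tau."""
    t = k * t_s - tau
    if t < 0.0 or t > t_w:
        return 0.0
    pulse = 0.5 * (1.0 - math.cos(2.0 * math.pi * nu * t / t_w))
    return amp * pulse * math.cos(nu * omega_c * t)


def ekf_update(x, p, z, g, r, eps=1e-6):
    """One scalar EKF measurement update for a constant (no-dynamics) state."""
    jac = (g(x + eps) - g(x - eps)) / (2.0 * eps)  # numerical Jacobian of g at x
    gain = p * jac / (jac * p * jac + r)           # filter gain
    x_new = x + gain * (z - g(x))                  # state update with the innovation
    p_new = (1.0 - gain * jac) * p                 # covariance update
    return x_new, p_new


def run_filter(tau_true, tau_init, p_init, r, n_samples, t_s, t_w, omega_c):
    """Process n_samples noise-free samples, starting from a localized initial estimate."""
    x, p = tau_init, p_init
    for k in range(n_samples):
        z = hanning_return(tau_true, k, t_s, t_w, omega_c)
        x, p = ekf_update(x, p, z, lambda tau: hanning_return(tau, k, t_s, t_w, omega_c), r)
    return x, p
```

A bank of such filters, one per resolution cell, would each call `run_filter` with its own `tau_init` and `p_init`; because the filters do not interact, they can be evaluated in parallel.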

2.6.2  Double  Observer  Model 

When information is available from two sensors, that is, whenever the target is in the overlap region and the target
is illuminated simultaneously by the two radars, the Doppler and time delay estimates from each sensor can be
combined to obtain a better estimate of target position and velocity.

Let $X'$ and $Y'$ denote the directions of a local coordinate system as shown in the insert in Figure 2. Let $\phi_{10}$
and $\phi_{20}$, the pointing angles of the two sensors, be chosen such that $\phi_{20} - \phi_{10} = 90$ deg. In this case
the direction $X'$ points directly along the line of sight (LOS) of $s_1$, and perpendicular to the LOS of $s_2$.
Likewise, $Y'$ points directly along the LOS of $s_2$ and perpendicular to the LOS of $s_1$. $X'$ is the in-track
direction for $s_1$ and the cross-track direction for $s_2$. $Y'$ is the in-track direction for $s_2$ and the cross-track
direction for $s_1$. For small angles such that $\sin(\Delta\phi) \approx \Delta\phi$, the position estimates in the
$X'$, $Y'$ coordinate system, which can be found from either sensor, are given by

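$$D_{x'_1} = c\, \tau_{11}/2 - D_{10}, \qquad D_{y'_1} = D_{10}\, c\, \tau_{12}/d_1$$

$$D_{x'_2} = D_{20}\, c\, \tau_{22}/d_2, \qquad D_{y'_2} = -(c\, \tau_{21}/2 - D_{20}) \tag{23}$$

where $D_{p0}$ is the nominal range from sensor $p$ to the center of the insert in Figure 2. The associated position error
variances are given by

$$\sigma^2_{x'_1} = c^2\, \mathrm{Var}[\tau_{11}]/4, \qquad \sigma^2_{y'_1} = D_{10}^2\, c^2\, \mathrm{Var}[\tau_{12}]/d_1^2$$

$$\sigma^2_{x'_2} = D_{20}^2\, c^2\, \mathrm{Var}[\tau_{22}]/d_2^2, \qquad \sigma^2_{y'_2} = c^2\, \mathrm{Var}[\tau_{21}]/4 \tag{24}$$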
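If it is assumed that the time delay estimation errors have Gaussian distributions, then the maximum likelihood
estimates of the target position in the overlap region, $D_{x'}$ and $D_{y'}$, which are the weighted sums of the
estimates at each sensor, are given by

$$D_{x'} = \frac{\sigma^2_{x'_2}\, (c\, \tau_{11}/2 - D_{10}) + \sigma^2_{x'_1}\, D_{20}\, c\, \tau_{22}/d_2}{\sigma^2_{x'_1} + \sigma^2_{x'_2}}$$

$$D_{y'} = \frac{\sigma^2_{y'_2}\, D_{10}\, c\, \tau_{12}/d_1 - \sigma^2_{y'_1}\, (c\, \tau_{21}/2 - D_{20})}{\sigma^2_{y'_1} + \sigma^2_{y'_2}} \tag{25}$$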

2.7 Joint Detection/Estimation

The target search region has been localized to the rectangular box shown in Figure 2. This box is subdivided into
several resolution cells as shown in this figure. The beam pattern from sensor $s_1$ allows this sensor to detect a target
and estimate its parameters if the target is located in resolution cells 1 through 21. Sensor $s_2$ can detect the target
if it is in cells 10 through 15, 22 through 25, or 26 through 31. If the target is not located in any of these cells
then the target is declared not present (or more precisely, not detectable). This situation is represented by the null
hypothesis $H_0$. The resolution cells are grouped into regions which will be used for minimum mean squared error
estimation. If the target is located in regions $R_1$ (resolution cells 1 through 9) or $R_3$ (resolution cells 16
through 21), only sensor $s_1$ can detect the target. Regions $R_4$ (resolution cells 22 through 25) and $R_5$ (resolution
cells 26 through 31) correspond to the coverage area of sensor $s_2$ only. If the target is located in region $R_2$
(resolution cells 10 through 15), both sensors can detect the target and perform parameter estimation. The remaining
area in the rectangle in Figure 2 is designated as region $R_0$, where neither sensor can detect the target.

Let $\theta_i \in \Theta$ designate the parameter vector that describes the different combinations of model uncertainty
and initial condition uncertainty. The parameter vector $\theta_i$ is assumed to be time invariant. The parameter vector
$\theta_i$, $1 \le i \le 56$, is defined to be the $i$th resolution cell and is used to define 56 different combinations
of initial conditions and models. $i$ corresponds to the range resolution cell number determined from the initial
conditions on the two time delays from each sensor.

In  general,  hypothesis  Hi,  representing  the  hypothesis  that  the  target  is  located  in  resolution  cell  i,  is  defined  by 
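
$$H_i:\quad z_{pm_k} = g_{pm_k}(x_{p_k}) + n_{p1_k} + n_{p2_k}, \qquad p = 1, 2, \quad m = 1, 2$$

where $n_{pj_k}$, $j = 1, 2$, represent the non-Gaussian and Gaussian noise, respectively, present at the $p$th sensor.
In regions $R_1$, $R_2$, and $R_3$, where sensor $s_1$ can detect the target, the component $g_{1m_k}$ is defined by (20) as

$$g_{1m_k} = \begin{cases} a_{1m_k}(\cdot)\, p_{1m_k}(\cdot)\, s_{1m_k}(\cdot) & \text{for } \tau_{1m_k} \le k t_s \le \tau_{1m_k} + t_{w_1} \\ 0 & \text{otherwise} \end{cases} \tag{28}$$

In the regions $R_0$, $R_4$ and $R_5$, $g_{1m_k} = 0$, $\forall\, k$, $m = 1, 2$. The delay is given by

$$\tau_{pm_k} = x_{p_k}(1) - \kappa_m\, x_{p_k}(2)/2 \tag{29}$$

where $\kappa_m = +1$ whenever $m = 1$, and $\kappa_m = -1$ whenever $m = 2$. In regions $R_2$, $R_4$, and $R_5$, where
sensor $s_2$ can detect the target, the component $g_{2m_k}$ is defined by (20) as

$$g_{2m_k} = \begin{cases} a_{2m_k}(\cdot)\, p_{2m_k}(\cdot)\, s_{2m_k}(\cdot) & \text{for } \tau_{2m_k} \le k t_s \le \tau_{2m_k} + t_{w_2} \\ 0 & \text{otherwise} \end{cases} \tag{30}$$

In the regions $R_0$, $R_1$ and $R_3$, $g_{2m_k} = 0$, $\forall\, k$, $m = 1, 2$.

The initial conditions are given by

$$\hat{x}_{p,0|0,\theta_i} = [\hat{\tau}_{p1_0}, \hat{\tau}_{p2_0}]^T, \qquad P_{p,0|0,\theta_i} = \mathrm{diag}\big(\mathrm{Var}[\tau_{p1_0}],\, \mathrm{Var}[\tau_{p2_0}]\big) \tag{31}$$

The initial estimates $\hat{\tau}_{pm_0}$, $m = 1, 2$, are chosen such that the position of the target for a signal
received at sensor $p$ is at the center of resolution cell $i$. The variances $\mathrm{Var}[\tau_{p1_0}]$ and
$\mathrm{Var}[\tau_{p2_0}]$ are determined based on a uniform distribution of the error within the cell.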

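Define $Z_k = \{z_1, z_2, \ldots, z_k\}$, where $z_k = [z_{1_k}^T, z_{2_k}^T]^T$, as the set of measurements up to time
$k$, and let $p(z_k \mid Z_{k-1}, \theta_i)$ be the probability density function of $z_k$ given the measurements
$Z_{k-1}$ and hypothesis $H_i$. The a posteriori probability of hypothesis $H_i$ is given by

$$P(\theta_i \mid Z_k) = \frac{P(\theta_i \mid Z_{k-1})\, \Lambda_i(z_k)}{\sum_{j=0}^{M} P(\theta_j \mid Z_{k-1})\, \Lambda_j(z_k)} \tag{32}$$

where $\Lambda_i(z_k)$ is the likelihood ratio defined by

$$\Lambda_i(z_k) = \frac{p(z_k \mid Z_{k-1}, \theta_i)}{p(z_k \mid Z_{k-1}, \theta_0)} \tag{33}$$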

The minimum mean squared error estimate can be found by combining the estimates from all of the cells within a
particular region. If the state vector $x_k$ is common to all models the minimum mean squared error (MMSE) estimate
can be used. The MMSE estimate for sensor $p$ in region $R_r$ can be expressed by

$$\hat{x}^{R_r}_{p,k|k} = \sum_{\text{cell } i \in R_r} P(\theta_i \mid Z_k)\, \hat{x}_{p,k|k,\theta_i} \tag{34}$$

The most likely region is selected using the MAP criterion. Define the hypothesis that the target is located in
region $R_r$ as $I_r$, $r = 0, 1, \ldots, 5$. The a posteriori probability associated with region $R_r$ is the sum of the
a posteriori probabilities of all of the cells in that region. This region-level probability is given by

$$P(I_r \mid Z_k) = \sum_{\text{cell } i \in R_r} P(\theta_i \mid Z_k) \tag{35}$$

The most likely region is chosen according to

$$\text{Choose } I_{r^*}: \quad r^* = \arg\max_{r \in \{0, 1, \ldots, 5\}} P(I_r \mid Z_k) \tag{36}$$

2.7.1 Definition of Priors


The a priori probabilities of each hypothesis are based on the area coverage of the sensors. The total number of
resolution cells shown in Figure 2 is 56. Of these, 25 are located in region $R_0$. All cells are assumed to have equal
probability of containing the target. The a priori probabilities are given by $P(\theta_0) = 25/56$,
$P(\theta_i) = 1/56$, $i = 1, 2, \ldots, 31$. The probabilities associated with regions $R_r$, $r = 0, 1, \ldots, 5$, are
given by $P(I_0) = 25/56$, $P(I_1) = 9/56$, $P(I_2) = 6/56$, $P(I_3) = 6/56$, $P(I_4) = 4/56$, $P(I_5) = 6/56$.
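
The region-level probabilities and the MAP region choice follow directly from the cell-to-region grouping. The sketch below assumes 56 equally likely cells, 25 of them undetectable and lumped into the null hypothesis, with the detectable cells grouped as 1 to 9, 10 to 15, 16 to 21, 22 to 25 and 26 to 31; the dictionary layout and function names are hypothetical.

```python
from fractions import Fraction

# Assumed grouping of detectable resolution cells into regions R_1 .. R_5.
REGIONS = {
    1: range(1, 10),
    2: range(10, 16),
    3: range(16, 22),
    4: range(22, 26),
    5: range(26, 32),
}
TOTAL_CELLS = 56
UNDETECTABLE_CELLS = 25


def cell_priors():
    """Equal-probability cell priors; index 0 lumps the undetectable cells (null hypothesis)."""
    priors = {0: Fraction(UNDETECTABLE_CELLS, TOTAL_CELLS)}
    for cells in REGIONS.values():
        for i in cells:
            priors[i] = Fraction(1, TOTAL_CELLS)
    return priors


def region_probabilities(cell_probs):
    """Sum cell probabilities over each region; region 0 is the null hypothesis."""
    probs = {0: cell_probs[0]}
    for r, cells in REGIONS.items():
        probs[r] = sum(cell_probs[i] for i in cells)
    return probs


def map_region(cell_probs):
    """Index of the region with the largest probability."""
    probs = region_probabilities(cell_probs)
    return max(probs, key=probs.get)
```

Applied to the a priori cell probabilities, `region_probabilities` returns 25/56, 9/56, 6/56, 6/56, 4/56 and 6/56 for regions 0 through 5; applied to the a posteriori cell probabilities at time $k$, the same two functions give the region-level probability and the MAP region decision.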

2.8 Simulation Experiments

An experimental study was conducted to evaluate the performance of the multi-sensor fusion technique. In this
evaluation the measurement noise consisted of 50% lognormal noise and 50% Gaussian noise. The nominal angles
from sensors $s_1$ and $s_2$ to the target were $\phi_{10} = 45$ deg and $\phi_{20} = 135$ deg, respectively. The nominal
range from $s_1$ to the target was $D_1 = 10$ miles. The nominal range from sensor $s_2$ to the target, $D_2$, was chosen
such that the received signal at $s_1$ was 5 dB higher than at $s_2$ for the same transmitted signal level and target
strength.

The carrier frequencies used by the two sensors were the same, $f_c$ = 10 x 10*. Both sensors sample the signal at a rate $f_s$ = 100 x 10*, and both signals have the same pulse width, $12/f_s$, for $p = 1, 2$. The resolution cell width is $1/f_s$ seconds. The associated initial error variance on time delays nip and rn, is tJ/12. The corresponding range resolution cell width is $\Delta r = c/(2 f_s)$. Thus, the initial variance for the angle-measurement delays is (19) V"(nio] = ((dfe)/(2/,Dp))*/l2, $p = 1, 2$. The separation between phase centers at the sensor was chosen to be 3 feet for each sensor. Simulations were performed for SNRs (at sensor $s_1$) ranging from -10 dB to 10 dB. 500 random target positions were chosen at each SNR. Of these 500 trials, 228 target positions were randomly chosen in region $A_0$, 91 in $A_1$, 54 in $A_2$, 44 in $A_3$, 40 in $A_4$, and 40 in $A_5$. The results given here are for monopulse processing (i.e. one pulse repetition interval (PRI)).

The probabilities of missed detection $P(I_0 \mid I_r)$ and correct classification (i.e. not only detection of the target but correct localization at the region level) $P(I_r \mid I_r)$, $r = 1, \ldots, 5$, are displayed in Table 1. The probability of misclassification, which is not shown in this table, is given by $P(I_q \mid I_r) = 1 - P(I_r \mid I_r) - P(I_0 \mid I_r)$, $q \neq r$. Sensor $s_2$ outperforms sensor $s_1$, which is to be expected since the SNR at $s_2$ is 5 dB higher than the SNR at sensor $s_1$. In the overlap region, $A_3$, the classification performance is better than in any other region, with an 85% probability of correct classification at -10 dB SNR. Additional numerical results have been generated with complete probability of detection (PD) and probability of false alarm (PFA). What appears as a discrepancy in $P(I_r \mid I_r)$ at -5 dB SNR for $r = 2, 3, 4$ is due to statistical error caused by the small sample size.

Table 1. Probabilities of Missed Detection and Correct Classification - Region Level
The estimation results are shown in Figure 4. All results shown in this figure are in reference to the $(X', Y')$ coordinate system. Figure 4(a) shows the average mean squared error for those detections in regions $R_1$ and $R_2$, in which only $s_1$ has coverage. Figure 4(c) shows similar results for regions $R_4$ and $R_5$, which are covered by sensor $s_2$. Figure 4(c) also illustrates the 5 dB performance advantage of sensor $s_2$ over $s_1$. Figure 4(b) shows the results for both sensors in region $R_3$. In this region, as shown in Table 3, the proper cell is almost always found. Thus the cross-range estimation error variance should improve by about 6 dB ($20\log(2)$) for sensor $s_2$, since the cross-range error for $s_2$ has been localized from 2 cells down to 1. Similarly, the cross-range error variance for sensor $s_1$ in region $R_3$ is reduced by about 10 dB ($20\log(3)$) since the target has been localized from 3 cells down to 1. This improvement is evident in Figure 4(b). Figure 4(d) shows the estimation results using the combined measurements obtained from (25, 26). Because of the larger variance in the cross-range error for each sensor and the fact that the lines of sight (LOS) of the two sensors intersect at a right angle, the combined estimate consists of the $X'$ estimate from sensor $s_2$ and the $Y'$ estimate from sensor $s_1$.
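The quoted improvements follow from the 20 log convention used in the text: narrowing the cross-range uncertainty from several cells to one scales the error by the cell ratio. A small sketch of that arithmetic (the helper name `variance_gain_db` is hypothetical):

```python
import math

def variance_gain_db(cells_before, cells_after=1):
    """Improvement in dB when localization narrows from cells_before to cells_after cells."""
    return 20.0 * math.log10(cells_before / cells_after)

gain_two_cells = variance_gain_db(2)    # about 6 dB
gain_three_cells = variance_gain_db(3)  # about 9.5 dB, quoted as roughly 10 dB
```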




3. CONCLUSION

A  model-based  adaptive  detection/estimation  approach  has  been  presented  for  multi-sensor  fusion.  It  is  shown  that 
excellent  performance  can  be  obtained  for  both  target  detection  and  target  parameter  estimation  using  this  technique. 
A  significant  advantage  of  this  technique  is  that  each  sensor  can  perform  detection  and  parameter  estimation  in  a 
decentralized mode. The final estimates and a posteriori probabilities from each sensor are processed by a centralized processor to derive the optimum estimate. The method provides an automatic referencing mechanism for the data from the different sensors (automatic data alignment) as long as the geometry and timing of the sweeping beams are known. For optimal target resolution performance, it is found that the lines of sight of the two sensors should be perpendicular to each other at any given time, requiring special synchronization.
ABSTRACT

Two different types of adaptive networks are considered for solving the centralized and distributed hypothesis testing problem. The performance of the two different types of networks is compared under different performance indices and training rules. It is shown that training rules based on the Neyman-Pearson criterion outperform error based training rules. Simulations are provided for data that are linearly and nonlinearly separable.


I.  INTRODUCTION 

The optimum Bayesian and Neyman-Pearson solution to the distributed decision fusion problem bears striking similarities to the structure of a neural network (NN) [28,29]. Moreover, NNs can, in principle, learn arbitrary input-output mappings, provided that they are sufficiently smooth. These two facts motivate the use of NNs for solving the centralized and distributed hypothesis testing problem. In selecting the proper NN layout, one could argue that a perceptron-type NN can learn any input-output mapping, thus it can be trained to solve the hypothesis testing problem. However, the ability of a perceptron-type NN to learn an arbitrary I/O mapping critically depends on the number of layers, the number of neurons per layer, and their interconnections, which cannot, in general, be determined a priori.

In order to conduct a comprehensive study of the ability of adaptive networks to solve the centralized and distributed hypothesis testing (CHT and DHT) problem, two different types of adaptive networks are considered: structured adaptive networks (SANs) and perceptron-type neural networks (PTNNs). By SAN we mean a network whose inputs are functionally related to the data through known functional transformations, and the outputs are parametrically dependent on the input. By PTNN we mean a multi-layered NN that consists of neurons in the classical sense, interconnected through synaptic weights.

The selected networks are trained using error based and Neyman-Pearson based indices of performance (IPs). The training rules are derived as gradient rules on the selected IPs. Simulations are conducted with linearly and nonlinearly separable Gaussian data.


II. Centralized Bayesian Hypothesis Testing (CBHT)

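Assuming $N$ statistically independent data sources, the optimal Bayesian or Neyman-Pearson (N-P) CBHT is the Likelihood Ratio Test (LRT)

$$ \Lambda(r) = \prod_{i=1}^{N} \frac{p(r_i \mid H_1)}{p(r_i \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} t_f \qquad (II.1) $$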


where $r_i$ designates the data from the $i$-th sensor and $H_i$ is the $i$-th hypothesis, $i = 0, 1$. The threshold $t_f$ for the Bayesian processor is determined by

$$ t_f = \frac{P_0 (C_{10} - C_{00})}{P_1 (C_{01} - C_{11})} \qquad (II.2) $$

where $P_0$, $P_1 = 1 - P_0$ are the priors on the two hypotheses and $C_{ij}$ is the cost of deciding in favor of hypothesis $H_i$ when the true hypothesis is $H_j$, $i, j = 0, 1$. For the N-P solution, the threshold $t_f$ is determined by the false alarm requirement at the fusion according to


$$ \int_{\Lambda(r) > t_f} dP(\Lambda(r) \mid H_0) \le \alpha_0 \qquad (II.3) $$


where $\alpha_0$ is the desired aggregate probability of false alarm (PFA) at the fusion. Notice that the Bayesian processor requires the knowledge of the priors $(P_0, P_1)$, which may not be objectively available. The N-P processor circumvents this requirement by constraining the PFA and maximizing the probability of detection (PD). Also notice that both processors are parametric.
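The contrast between the two processors can be made concrete with a short sketch. The Bayesian threshold below uses the standard cost and prior ratio with $C_{ij}$ the cost of deciding $H_i$ when $H_j$ is true; the N-P threshold is worked out for an assumed unit-variance Gaussian mean-shift problem, where the likelihood ratio is monotone in the observation so the test can be applied to the observation itself. The function names and the example problem are illustrative assumptions.

```python
from statistics import NormalDist

def bayes_threshold(p0, c00, c01, c10, c11):
    """Likelihood-ratio threshold from priors and costs; c_ij = cost of deciding H_i under H_j."""
    p1 = 1.0 - p0
    return p0 * (c10 - c00) / (p1 * (c01 - c11))

def np_threshold_mean_shift(alpha0, m=1.0):
    """N-P test for H1: r ~ N(m, 1) versus H0: r ~ N(0, 1), with m > 0 (assumed example).

    Returns the threshold on r that gives PFA = alpha0 and the resulting PD.
    """
    r_thr = NormalDist().inv_cdf(1.0 - alpha0)
    pd = 1.0 - NormalDist(mu=m, sigma=1.0).cdf(r_thr)
    return r_thr, pd

t_bayes = bayes_threshold(p0=0.5, c00=0.0, c01=1.0, c10=1.0, c11=0.0)
r_thr, pd = np_threshold_mean_shift(alpha0=0.05, m=1.0)
```

The Bayesian threshold needs the priors as inputs, whereas the N-P threshold is fixed by the false alarm level alone.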


III. Distributed Binary Hypothesis Testing (DBHT)

Assuming that each sensor makes binary or multi-level independent decisions $u_i$, $i = 1, \ldots, N$, the optimal Bayesian or N-P DBHT solution under statistical independence consists of multilevel likelihood ratio quantizers (LRQs) [12,18] at each sensor and an LRT at the fusion. For binary LRQ at each sensor [4 to 19 and 22 to 31], with

$$ u_i = \begin{cases} +1, & \text{if the } i\text{-th local decision favors hypothesis } H_1; \\ -1, & \text{if the } i\text{-th local decision favors hypothesis } H_0 \end{cases} \qquad (III.1) $$

for the $i$-th sensor, the optimal Bayesian or N-P DBHT takes on the form

$$ \sum_{i=1}^{N} (w_i u_i + t_i) \underset{H_0}{\overset{H_1}{\gtrless}} t_f \qquad (III.2) $$

where

$$ w_i = \log \frac{P_{D_i}(1 - P_{F_i})}{P_{F_i}(1 - P_{D_i})} \quad \text{and} \quad t_i = \log \frac{P_{D_i}(1 - P_{D_i})}{P_{F_i}(1 - P_{F_i})} \qquad (III.3) $$

The threshold $t_f$ for the Bayesian DBHT is determined by an expression similar to (II.2) that depends on the priors $(P_0, P_1)$. For the N-P DBHT the threshold $t_f$ is determined by the PFA requirement, equation (II.3). It is interesting to notice that (III.2) can be written as

$$ \sum_{i=1}^{N} w_i u_i + w_0 \underset{H_0}{\overset{H_1}{\gtrless}} 0 \qquad (III.4) $$

where

$$ w_0 = \sum_{i=1}^{N} t_i - t_f \qquad (III.5) $$

The form of (III.4) is reminiscent of an NN, figures 1 and 2, [28,29].
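A minimal sketch of a fusion rule of this weighted-sum type is given below, assuming local decisions coded as +1 or -1 and per-sensor weights built from each sensor's detection and false alarm probabilities. With the weights defined this way, each term equals twice the log-likelihood ratio of the corresponding local decision, so the factor of two is simply absorbed in the fusion threshold. The names `fusion_weights` and `fuse` and all numeric values are illustrative assumptions.

```python
import math

def fusion_weights(pd, pf):
    """Per-sensor weight w (multiplies u = +/-1) and offset t."""
    w = math.log(pd * (1.0 - pf) / (pf * (1.0 - pd)))
    t = math.log(pd * (1.0 - pd) / (pf * (1.0 - pf)))
    return w, t

def fuse(decisions, pds, pfs, t_f):
    """Return +1 (H1) if the weighted sum of local decisions exceeds t_f, else -1 (H0)."""
    total = 0.0
    for u, pd, pf in zip(decisions, pds, pfs):
        w, t = fusion_weights(pd, pf)
        total += w * u + t
    return 1 if total > t_f else -1

# Three hypothetical identical sensors with PD = 0.9 and PFA = 0.1
pds, pfs = [0.9] * 3, [0.1] * 3
decision_a = fuse([1, 1, -1], pds, pfs, t_f=0.0)
decision_b = fuse([-1, -1, 1], pds, pfs, t_f=0.0)
```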


IV. Centralized Hypothesis Testing and Distributed Decision Fusion with Structured Adaptive Networks (SANs)


A. Centralized Binary Hypothesis Testing with SANs

As discussed in Section II, the optimal decision test for a binary hypothesis problem is a likelihood ratio test (LRT) of the form

$$ \Lambda(r) = \frac{p(r \mid H_1)}{p(r \mid H_0)} \underset{H_0}{\overset{H_1}{\gtrless}} t_f \qquad (IV.1) $$
where $p(r \mid H_i)$ is the conditional probability density function (pdf) of the data conditioned on $H_i$ ($i = 0, 1$) and $t_f$ ($> 0$) is a threshold. For Gaussian problems, $\ln[\Lambda(r)]$ has a simpler form and can be used in lieu of (IV.1) in the equivalent log-LRT

$$ \ln[\Lambda(r)] = \ln\left[ \frac{p(r \mid H_1)}{p(r \mid H_0)} \right] \underset{H_0}{\overset{H_1}{\gtrless}} \gamma := \ln(t_f) \qquad (IV.2) $$


For  example,  if  the  problem  is  of  the  form 
where $N(m, \sigma^2)$ indicates a Gaussian pdf with mean $m$ and variance $\sigma^2$, then the log-LRT test from (IV.2) becomes


l(r)=  4-^  r'-J  mi  -  ,  J  2,  ^ 
where $l(r)$ is the sufficient statistic for the problem (IV.3). The previous example serves as motivation for the structure of the network that is discussed in the following section.


1.  Network  Structure 

The structure of the network is shown in Fig. 3. The functions $\phi_i$ are chosen to reflect any a priori knowledge about the problem. In the Gaussian problem for example, in view of (IV.4), it is natural to take

$$ \phi_i(z) = z^i, \quad i = 0, \ldots, k \qquad (IV.5) $$

with $k = 2$. In the general case $k > 2$. Note that in a general problem, the $\phi_i$'s can assume different functional forms. From figure 3, the output $y_j$ of the network due to the data $r_j$ is given by

$$ y_j = g\left( \sum_{i=0}^{k} c_i \phi_i(r_j) \right) \qquad (IV.6) $$

where $g(\cdot)$ is a sigmoid function defined as

$$ g(x) = \frac{1 - e^{-\lambda x}}{1 + e^{-\lambda x}} \qquad (IV.7) $$

where $\lambda > 0$ adjusts the steepness of its slope. The network of figure 3 is capable of decision making, if one maps $y > 0$ to, say, $H_1$.
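The forward pass of such a structured network can be sketched as follows, assuming power-function inputs $z^i$ and a bipolar sigmoid that saturates at -1 and +1 (written here with `tanh`, which equals $(1 - e^{-\lambda x})/(1 + e^{-\lambda x})$ when evaluated at $\lambda x / 2$). The function names and coefficient values are hypothetical.

```python
import math

def sigmoid(x, lam=1.0):
    """Bipolar sigmoid in (-1, 1); lam controls the steepness of the slope."""
    return math.tanh(0.5 * lam * x)

def san_output(r, c, lam=1.0):
    """Network output for a scalar datum r with coefficients c[0..k] on the powers r**i."""
    s = sum(c_i * r ** i for i, c_i in enumerate(c))
    return sigmoid(s, lam)

def san_decide(r, c, lam=1.0):
    """Map a positive output to H1 (returned as 1) and anything else to H0 (returned as 0)."""
    return 1 if san_output(r, c, lam) > 0.0 else 0

# Hypothetical coefficients implementing the quadratic statistic r**2 - 1
c = [-1.0, 0.0, 1.0]
y_large = san_output(2.0, c)
y_small = san_output(0.5, c)
```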

Given the above network structure, the hypothesis testing problem takes on the following form: given a set of $\phi_i$'s, $i = 0, 1, \ldots, k$, and a set of observations $r$ along with the hypotheses under which they are generated, choose the coefficients $c_i$, $i = 0, 1, \ldots, k$, so that the resulting decision scheme is close to the optimal one in some suitably defined sense. It is therefore necessary to establish a criterion of optimality and an algorithm that updates the weights $c_i$, $i = 0, 1, \ldots, k$, in order to meet this criterion. The second task is the so-called training of the network. In the sequel we discuss two different performance criteria and derive the update equations for the parameters of the network (synaptic weights) for each one of them.

The first criterion, which appears more intuitive especially in view of the backpropagation method [20], is to minimize the sum of the squares of errors over all the training data. In this case, the index of performance (IP) can be defined by

$$ J(t) = \sum_{j=1}^{N} e_j^2(t) \qquad (IV.8) $$

where $N$ is the number of available training data (typically around 50-100 per hypothesis) and $e_j$ is the error defined by

$$ e_j(t) := y_j(t) - y_j^d \qquad (IV.9) $$

where $y_j^d$ is equal to $+1$ if $r_j$ is generated under $H_1$ or $-1$ if it is generated under $H_0$. Note that the time index $t$ is introduced to denote updates of the weights $c_i$. Since (IV.8) does not impose any penalty on the relative magnitudes of the weights, a natural extension of (IV.8) is

$$ J(t) = \sum_{j=1}^{N} e_j^2(t) + \sum_{n=0}^{k} \rho_n c_n^2(t) \qquad (IV.10) $$

where $\rho_n > 0$ are suitably chosen weighing coefficients. Under (IV.8) or (IV.10), the network will approximate a minimum probability of error classifier, i.e. it will minimize the probability of error given by

$$ P_e = \Pr(H_1 \mid H_0) P_0 + \Pr(H_0 \mid H_1) P_1 \qquad (IV.11) $$

where $P_0$, $P_1$ are the prior probabilities of the respective hypotheses. In this case, the training will try to "fit" the model (IV.6) to the training data so that the sum of the square errors is minimized. Although this approach seems natural, it is not suitable for hypothesis testing problems for two reasons. First, the network that minimizes (IV.8) or (IV.10) for a given training set is not asymptotically optimal as the volume of the available training data goes to infinity, simply because even if $P_e$ can be made very close to zero for a given training set (for example by taking $k \approx N$), the network may not result in a $P_e$ close to the probability of error of the LRT over the entire data ensemble. (Note that since data may be generated by either hypothesis, $P_e = 0$ is not always possible.) On the other hand, if $k$ is kept moderate, fitting is very difficult, especially when the data under both hypotheses are closely clustered, as in the Gaussian case when the pdfs under the two hypotheses have the same mean and comparable variances. An additional problem with the training rule (IV.8) or (IV.10) is the lack of a general stopping criterion for the training. From the discussion above, (IV.8) and (IV.10) are not satisfactory criteria for our problem, although they result in acceptable performance in linearly separable data cases, as is shown in the simulations section.

The second criterion used for training is based on the Neyman-Pearson (N-P) approach, which maximizes the probability of detection at a given (fixed) false alarm probability level. The key difference between the N-P and the least squares error approach is that in the N-P training the hypotheses are separated and enter separately in the performance index. For this method, the performance index is given by

$$ J(t) = P_M(t) + \rho [P_F(t) - P_{F_0}]^2 \quad (\rho > 0) \qquad (IV.12) $$

where $P_{F_0}$ is the preset level of false alarm probability and $P_M$, $P_F$ are defined by


h,{t)  := 


2 


{IV.12) 


PHt)  := 


2  w  -  e;.. 


(fV.U) 


and are approximate expressions for the miss probability $P_M$ and the false alarm probability $P_F$ of the network, respectively. For a large sample size and large $\lambda$, the expressions on the RHS of (IV.13) and (IV.14) approximate the $P_M(t)$ and $P_F(t)$ of the network. In view of (IV.12), the training in this case should compute the weights $c_i$, $i = 0, \ldots, k$, that minimize $J$ for the given training set.
In the following, for each of the above optimality criteria, we derive the update equations for the (synaptic) weights.


2. Gradient Update Laws
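The derivative of $g(x)$ is given by

$$ g'(x) = \frac{2 \lambda e^{-\lambda x}}{(1 + e^{-\lambda x})^2} \qquad (IV.15) $$

The time derivative of $J(t)$ from (IV.8) is

$$ \dot{J}(t) = \sum_{n=0}^{k} \frac{\partial J}{\partial c_n} \frac{dc_n}{dt} \qquad (IV.16) $$

from which it is clear that if

$$ \frac{dc_n}{dt} = -\alpha \frac{\partial J}{\partial c_n}, \quad \alpha > 0 \qquad (IV.17) $$

we have that

$$ \dot{J}(t) = -\alpha \sum_{n=0}^{k} \left( \frac{\partial J}{\partial c_n} \right)^2 \le 0 $$

which implies that $J$ is decreasing for as long as the network does not reach an equilibrium point. A simple first order update expression for the weights follows directly from (IV.17) and from the fact

$$ \frac{\partial y_j}{\partial c_n} = g'\left( \sum_{i=0}^{k} c_i \phi_i(r_j) \right) \phi_n(r_j) \qquad (IV.18) $$

and has the following form

$$ c_n(t+1) = c_n(t) - (\alpha \Delta t) \frac{\partial J}{\partial c_n} \qquad (IV.19) $$

where $n = 0, 1, \ldots, k$.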
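One step of a first-order gradient update on a sum-of-squared-errors index can be sketched as below. The sketch assumes power-function inputs, a bipolar sigmoid, targets of +1 and -1, and an index equal to the plain sum of squared errors with no extra scaling; the step size `alpha` and the tiny data set are hypothetical.

```python
import math

def g(x, lam=1.0):
    return math.tanh(0.5 * lam * x)            # bipolar sigmoid in (-1, 1)

def dg(x, lam=1.0):
    y = g(x, lam)
    return 0.5 * lam * (1.0 - y * y)           # derivative of the sigmoid above

def net_input(c, r):
    return sum(c_i * r ** i for i, c_i in enumerate(c))

def sse(c, data, targets, lam=1.0):
    """Sum of squared errors between network outputs and the +/-1 targets."""
    return sum((g(net_input(c, r), lam) - yd) ** 2 for r, yd in zip(data, targets))

def mse_step(c, data, targets, alpha=0.05, lam=1.0):
    """One batch gradient-descent step on the sum of squared errors."""
    grad = [0.0] * len(c)
    for r, yd in zip(data, targets):
        s = net_input(c, r)
        e = g(s, lam) - yd
        for n in range(len(c)):
            grad[n] += 2.0 * e * dg(s, lam) * r ** n   # chain rule through the sigmoid
    return [c_n - alpha * g_n for c_n, g_n in zip(c, grad)]

data = [-1.0, 0.0, 1.0, 2.0]
targets = [-1.0, -1.0, 1.0, 1.0]
c0 = [0.0, 0.0, 0.0]
c1 = mse_step(c0, data, targets)
```

For a sufficiently small step the index decreases from one update to the next, which is the discrete counterpart of the monotonic decrease argued above.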

For (IV.10), in a similar manner, the recursion update laws are given by


C,{t  +  1)  =  (1  +  P„&t)Cn(t)  -  (oAt) 


jal  \iaO  / 


(/r.2o) 


which results in significant improvement in performance and rate of convergence, as found from simulations. For the IP given by (IV.12), the derivation of the update equations is as follows:


Using the chain rule, we obtain


^  SPm  dcw  dPp  _  ^  dPf  de„ 
dt  ^  dcn  dt  '  dt  dc„  dt 

naO  naO 


(n^2i) 


(7T'.22) 


The partial derivatives in (IV.22) are given by the expressions


dPs, 

(/r.23) 

dCn 

♦ 

dPF 

I 

( TV 

dCn 

2  fV-V-p- 

» 

where  as  before 


Hence  the  gradient  update  rule  is  given  by 


dc„  dPM  -  p  ^8Pf 


which  results  in  the  following  iterative  update  expression  for  Cn 

e„(t  -  1)  =  c„(f)  -  (uAl)  -  p(Pf  - 


(/r.25i 


(/r.261 


(;t'.2:) 


which, in view of (IV.23), (IV.24), is a so-called batch training method since all training data are required for each update.
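A batch update of this kind can be sketched as follows. The sketch is written under explicit assumptions that are not confirmed by the text: the miss and false alarm probabilities are approximated by averaging $(1 - y)/2$ over the $H_1$ training data and $(1 + y)/2$ over the $H_0$ training data, the index is the soft miss probability plus `rho` times the squared false alarm deviation, and the sigmoid is bipolar. All names and numbers are hypothetical.

```python
import math

def g(x, lam=1.0):
    return math.tanh(0.5 * lam * x)

def dg(x, lam=1.0):
    y = g(x, lam)
    return 0.5 * lam * (1.0 - y * y)

def net_input(c, r):
    return sum(c_i * r ** i for i, c_i in enumerate(c))

def soft_rates(c, h1_data, h0_data, lam=1.0):
    """Smooth approximations of the miss and false alarm probabilities (assumed form)."""
    pm = sum(1.0 - g(net_input(c, r), lam) for r in h1_data) / (2.0 * len(h1_data))
    pf = sum(1.0 + g(net_input(c, r), lam) for r in h0_data) / (2.0 * len(h0_data))
    return pm, pf

def np_index(c, h1_data, h0_data, pf0, rho, lam=1.0):
    pm, pf = soft_rates(c, h1_data, h0_data, lam)
    return pm + rho * (pf - pf0) ** 2

def np_step(c, h1_data, h0_data, pf0, rho, alpha=0.1, lam=1.0):
    """One batch gradient step; every training datum contributes to each update."""
    _, pf = soft_rates(c, h1_data, h0_data, lam)
    new_c = []
    for n, c_n in enumerate(c):
        dpm = -sum(dg(net_input(c, r), lam) * r ** n for r in h1_data) / (2.0 * len(h1_data))
        dpf = sum(dg(net_input(c, r), lam) * r ** n for r in h0_data) / (2.0 * len(h0_data))
        new_c.append(c_n - alpha * (dpm + 2.0 * rho * (pf - pf0) * dpf))
    return new_c

h1 = [1.0, 2.0, 1.5]
h0 = [-0.5, 0.0, 0.5]
c0 = [0.0, 0.0, 0.0]
j_before = np_index(c0, h1, h0, pf0=0.1, rho=5.0)
c1 = np_step(c0, h1, h0, pf0=0.1, rho=5.0)
j_after = np_index(c1, h1, h0, pf0=0.1, rho=5.0)
```

Because the two hypotheses enter through separate averages, the update can trade missed detections against false alarms instead of fitting every training point to its label.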

In  the  remainder  of  this  section,  we  compare  the  performance  of  the  above  training  methods  for  two 
hypothesis  testing  problems. 

3. Simulation Results: The Centralized Case

Two different hypothesis testing paradigms were selected in order to compare the performance of SANs in linearly and nonlinearly separable data ensembles under the MSE and N-P training rules. The performance was benchmarked with respect to the size of the training data ensemble, the number of power terms ($c_i$'s) in the functional representation of the data, and the training rule.

The two selected problems for centralized and distributed hypothesis testing were:

(i) a Linear Gaussian Problem (LGP)

$$ r = \begin{cases} 1 + N(0,1) & : H_1 \\ N(0,1) & : H_0 \end{cases} \qquad (LGP) $$

(ii) a Quadratic Gaussian Problem (QGP)

$$ r = \begin{cases} N(0,5) & : H_1 \\ \ldots & : H_0 \end{cases} \qquad (QGP) $$

where $N(m, \sigma^2)$ is the Gaussian distribution with mean $m$ and variance $\sigma^2$. For each problem, the optimal LRT test follows directly from (IV.4).

In all cases, both the mean-squared-error (MSE) rule, eq. (IV.8), and the Neyman-Pearson (N-P) rule, eq. (IV.12), were used to train the SANs. The simulations were conducted as follows. The number of coefficients was fixed to either three ($k = 2$) or six ($k = 5$). Experiments with samples of one hundred (fifty per hypothesis) and two hundred (one hundred per hypothesis) data points were performed. The initial value of the $c_i$ coefficients was zero in all simulations. For the MSE training, selective training was used to avoid convergence problems that arise during training from data that belong to different hypotheses but are "metrically" close. According to the selective rule, at each training session, corrections were made only over those data points that were identified as belonging to the correct hypothesis at the beginning of the session.

An  arbitrary  stopping  rule  was  also  used  to  terminate  the  MSE  training  when  the  gradient  was  less  than 

10"‘ 

N-P training was performed at different PFAs. The post-training Receiver Operating Characteristics (ROCs) were obtained by keeping all the $c_i$ coefficients fixed at their training values and varying the threshold ($c_0$).

The ROCs were experimentally obtained by running ten thousand data points (five thousand per hypothesis) through the SAN, excluding the data points used for training. For each problem, we selected the coefficients that corresponded to the value of the PFA which generates the experimental ROC with the larger area when tested on the training data. For the LGP, the N-P training method outperforms the error training method. This is also the case for the QGP. The simulation results for both problems are summarized in Table 1 for the error training and Table 2 for the Neyman-Pearson method, respectively.
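The procedure of sweeping a threshold to trace an experimental ROC and comparing ROCs by area can be sketched as follows. The scores stand for the network's pre-threshold outputs on held-out data; the function names and the toy scores are hypothetical, and the area is computed with the trapezoidal rule as one simple choice.

```python
def roc_points(scores_h1, scores_h0):
    """Empirical (PFA, PD) pairs obtained by sweeping a threshold over all observed scores."""
    thresholds = [float("-inf")] + sorted(set(scores_h1) | set(scores_h0))
    points = []
    for thr in thresholds:
        pd = sum(s > thr for s in scores_h1) / len(scores_h1)
        pf = sum(s > thr for s in scores_h0) / len(scores_h0)
        points.append((pf, pd))
    return sorted(points)

def roc_area(points):
    """Area under the piecewise-linear ROC through the sorted (PFA, PD) points."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

# Toy scores: perfectly separated hypotheses versus indistinguishable ones
area_separated = roc_area(roc_points([2.0, 3.0], [0.0, 1.0]))
area_identical = roc_area(roc_points([0.0, 1.0], [0.0, 1.0]))
```

Selecting the trained coefficients whose ROC has the largest area then amounts to calling `roc_area` once per candidate and keeping the maximum.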

Some  conclusions  drawn  from  the  simulations  follow. 

1)  The  N-P  training  method  outperforms  the  error  based  training  method.  This  is  clear  from  the  QGP 
where  the  data  under  the  two  hypotheses  are  not  well  separated  spatially  as  in  LGP,  in  which  the  data  are 
clustered  around  the  two  well  separated  means. 

2) If the model is overparameterized, the performance of the N-P trained SAN is sensitive to the value of $P_{F_0}$. For example, in the QGP, the performance is good for $P_{F_0} = 0.7$ and poor for $P_{F_0} = 0.2$. As a result, one should try several values of $P_{F_0}$ and choose the one for which the ROC (obtained from testing on the training data after training) has the largest area. Furthermore, one could also start with a low value for $k$ (say $k = 2$) and keep increasing its value, choosing finally the ROC with the largest area.

3)  In  general,  N-P  training  results  in  a  SAN  that  performs  close  to  the  optimum  test.  Since  no  a  priori 
knowledge  for  the  pdfs  is  necessary,  this  is  a  powerful  approach  especially  in  the  case  in  which  the  volume  of 
the  available  data  is  not  sufficiently  large  for  a  reliable  estimate  of  the  pdfs  under  each  hypothesis. 


B.  Distributed  Decision  Fusion  with  N-P  Rule  Trained  SANs 

1.  Network  Structure 

The fusion system in Fig. 12, which consists of three identical sensors interconnected in parallel, was used to test the performance of N-P trained SANs in data and decision fusion problems. In the centralized data fusion test, each sensor in the configuration of Fig. 12 simply relays its observations to the fusion directly. The fusion is replaced by a SAN similar to the one shown in Fig. 3. Thus, the centralized data fusion SAN is identical to the one discussed in the previous section, except that three data are available at a time, instead of a single measurement as in the case of the single-sensor SAN.

In the distributed decision fusion (DDF), each sensor in the configuration of Fig. 12 is replaced by a SAN similar to the one in Fig. 3. Due to the similarity of the sensors, it is assumed that a symmetric solution, i.e. identical synaptic weights and thresholds among all three sensors, results in a solution that is close to the optimal one, if not "the optimal". Under the assumption (or constraint) of identical operating points, the structure of the optimal DDF, eq. (III.2), simplifies to


with  the  convention. 
$$ u_i = \begin{cases} 1 & \text{if the } i\text{-th local decision favors hypothesis } H_1 \\ 0 & \text{if the } i\text{-th local decision favors hypothesis } H_0 \end{cases} $$
Notice that the numerical values associated with each sensor decision are merely an expressional convenience and do not play any role in the outcome of the fusion process (see Section V as well).

Given the structure of the optimal DDF equation (IV.28) in the symmetric case, the only variables that determine the performance of the fusion for a target false alarm probability are the thresholds at the sensors (common among all sensors) and the fusion threshold. Thus, in the SAN implementation of the symmetric DDF only two parameters are adaptively adjusted: the common threshold for all sensors and the fusion threshold. This structure was used for training the SAN to perform the DDF for the fusion system of figure 12 using the N-P training rule. However, N-P training of the network by varying the two thresholds simultaneously resulted in very poor performance of the fusion. Thus, instead of training all the sensors simultaneously by minimizing the N-P performance index at the fusion, the ROC of each sensor was obtained separately using N-P training first. Then, the fusion rule was fixed a priori, and the network ROC was obtained by varying only the common threshold at the sensors after they were trained.


2. Simulation Results

In order to compare the performance of the centralized hypothesis testing with the DDF using the SAN, the same two binary hypothesis testing problems that were used for testing the performance of SANs in CBHT were also used for DBHT. The simulations for all problems were performed as follows. In all cases, the size of the training set is not larger than 200 data points. Post-training testing is performed on at least 2000 data points other, of course, than the training data points. The initial value of all $c_i$'s is zero. Due to the training rules that implement a true gradient descent, convergence is monotonic in all cases. The values of the weights after training for each case are given in Table 2.

The DDF was done by pretraining each sensor with the test set individually using N-P training. To implement the ROC of each sensor, a SAN with two terms in the power expansion ($k = 2$) was used. For the LGP, case 1, Table 2, the training set consists of 50 data points per hypothesis. The network was trained using 1000 iterations and the N-P training rule. For the QGP, 100 data points per hypothesis were used for training, case 3, Table 2. Since all the sensors are assumed to be identical and operating at the same operating false alarm and detection probability point, the synaptic weights (coefficients $c_i$) for the DDF for all three of them are identical, and identical to the weights used for hypothesis testing by each one individually, Table 2.

In all DDF cases, the sensors were assumed to be identical, all operating at the same PFA and PD. The "OR", "AND", and the "ML" (majority logic) rules were used for decision fusion. The ROCs of the different fusion rules for the DDF are compared among themselves and with the centralized fusion ROCs in Figs. 13, 14. The following conclusions can be drawn from these figures.

In the LGP, the majority rule seems to give the best ROC for DDF, which is close to the SAN performance on the centralized hypothesis testing. For the QGP, however, the OR rule seems to yield the best ROC, which, again, is close to the centralized ROC. A general conclusion from the numerical results seems to be that for linearly separable data, the majority fusion rule yields the best ROC. However, for quadratically separable data, the OR fusion rule yields the best ROC.
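For three identical, statistically independent sensors, the fused operating point under each of the three rules follows directly from the common local probability. The sketch below assumes independence between sensor decisions; the function names and the example operating point are hypothetical.

```python
def fused_prob(p, rule):
    """Probability that the fusion declares H1 when each of 3 sensors declares H1 with probability p."""
    if rule == "OR":
        return 1.0 - (1.0 - p) ** 3           # at least one sensor fires
    if rule == "AND":
        return p ** 3                         # all three sensors fire
    if rule == "ML":
        return 3.0 * p ** 2 * (1.0 - p) + p ** 3   # at least two of three fire
    raise ValueError("rule must be 'OR', 'AND' or 'ML'")

def fused_operating_point(pd, pf, rule):
    """Fused (PD, PFA) from the common local PD and PFA."""
    return fused_prob(pd, rule), fused_prob(pf, rule)

# Hypothetical local operating point
pd_or, pf_or = fused_operating_point(0.8, 0.1, "OR")
pd_ml, pf_ml = fused_operating_point(0.8, 0.1, "ML")
```

Sweeping the common local threshold moves the local pair along the sensor ROC, and applying `fused_operating_point` to each pair traces the ROC of the chosen fusion rule.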

V. Distributed Decision Fusion with Perceptron-Type Neural Networks

Although the form of the optimal Bayesian/N-P DDF is known, for both binary and multi-level quantizations [9,12,14], the optimal thresholds are given, in general, in terms of coupled, nonlinear equations [8], [10], whose solution is not forthcoming even in simple cases. Suboptimal numerical solutions to the N-P DDF [10] may still be computationally intensive if the fusion rule is unknown. The optimal solution to the Bayesian and Neyman-Pearson DDF problem, eq. (III.4), bears striking topological and functional similarities with the structure of a neural network (NN). This topological similarity suggests an alternative approach to solving the computationally NP-hard [5] DDF problem. By slightly modifying the values that designate the decision for notational convenience, the optimal Bayesian and N-P DDF rule (III.4) takes on the form


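$$ u_i = \begin{cases} 1, & \text{if the } i\text{-th local decision favors hypothesis } H_1 \\ 0, & \text{if the } i\text{-th local decision favors hypothesis } H_0 \end{cases} \qquad (V.1) $$

$$ \sum_{i=1}^{N} (w_i u_i + t_i) \underset{H_0}{\overset{H_1}{\gtrless}} T_f \qquad (V.2) $$

where

$$ w_i = \log \frac{P_{D_i}}{P_{F_i}} - \log \frac{1 - P_{D_i}}{1 - P_{F_i}} \qquad (V.3) $$

and

$$ t_i = \log \frac{1 - P_{D_i}}{1 - P_{F_i}} \qquad (V.4) $$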


By combining the constant thresholds together with the unknown operational threshold $T_f$ and defining

$$ w_0 := -T_f + \sum_{i=1}^{N} t_i \qquad (V.5) $$

the DDF rule (V.2) can be written in a form reminiscent of an NN architecture:

$$ w_0 + \sum_{i=1}^{N} w_i u_i \underset{H_0}{\overset{H_1}{\gtrless}} 0 \qquad (V.6) $$

A noticeable advantage of (V.6) over (V.2) is that the unknown threshold $T_f$ has been absorbed in the synaptic weight $w_0$, which can be determined through training by assuming that it corresponds to the interconnection weight of an additional, constant input to the fusion neuron. Notice that the threshold in (V.6) is known, constant, and equal to zero. Thus, (V.6) can be implemented by using an NN and replacing the hard threshold decision rule by a smoother sigmoidal nonlinearity [20,21, Nils '90, TPS '90].
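The effect of absorbing the operational threshold into a bias weight can be checked numerically. The sketch below compares a thresholded weighted sum with per-sensor offsets against the equivalent zero-threshold form with a single bias, over every combination of three binary decisions. The weights, offsets, and threshold are arbitrary illustrative numbers, not values from the report.

```python
from itertools import product

def rule_with_threshold(u, w, t, T_f):
    """Decide H1 when the sum of (w_i * u_i + t_i) exceeds the operational threshold T_f."""
    return sum(w_i * u_i + t_i for w_i, u_i, t_i in zip(w, u, t)) > T_f

def rule_with_bias(u, w, w0):
    """Decide H1 when the bias plus the weighted decisions exceeds zero."""
    return w0 + sum(w_i * u_i for w_i, u_i in zip(w, u)) > 0.0

w = [2.0, 1.5, 1.0]        # hypothetical synaptic weights
t = [-0.8, -0.6, -0.4]     # hypothetical constant offsets
T_f = 0.3                  # hypothetical operational threshold
w0 = -T_f + sum(t)         # bias that absorbs the threshold and the offsets

agree = all(
    rule_with_threshold(u, w, t, T_f) == rule_with_bias(u, w, w0)
    for u in product([0, 1], repeat=3)
)
```

Since the bias multiplies a constant input, it can be learned together with the other weights, which is what makes the zero-threshold form convenient for training.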

In figure 1 the optimal Bayesian (N-P) DDF structure is shown when the local LR is linear in the data. If the (local) sensors and fusion in figure 1 are identified with neurons and the thresholds are replaced by continuous sigmoid functions, there is a one-to-one topological correspondence between the D-S DDF architecture and the simple, two-layer NN of figure 2. The topological similarities suggest that one can take advantage of the learning capabilities of an NN and train it to solve the Bayesian DDF even when the channel statistics are not known. The solution to the Bayesian DDF can be achieved by using any one of the available training rules. For example, if a quadratic error is defined at the fusion by squaring the difference between the actual hypothesis and the output of the fusion, a gradient based algorithm, such as backpropagation [20], can be used to update the synaptic weights, i.e. the coefficients of the LRTs in the Bayesian DDF.

Training of the NN with a quadratic error criterion will result in a minimum error computer, if trained
properly. A quadratic error training attempts to fit the data in two different hypotheses by minimizing a
distance criterion. However, if the data in the training set are numerically close under the two hypotheses,
overtraining of the NN in order to achieve perfect discrimination of the data in the training set will result in
poor post-training performance. To avoid performance degradation from overtraining, selective training has
been used with excellent results. The drawbacks associated with overtraining in the quadratic error criterion
can be avoided by using an N-P based optimality criterion, such as the minimization of the miss probability
at the fusion for fixed false alarm probability. Such a training criterion results in an NN that implements the
optimal N-P DDF. If the optimal Bayesian DDF is highly nonlinear, an NN whose inputs are polynomial functions
of the data (polynomial network) can be used to solve the optimal Bayesian DDF. This approach corresponds
to approximating the LRT by a truncated Taylor series expansion (T.S.E.) or a Volterra series, analogous to the
approach used in SANs, figure 3, for determining the coefficients for each power in the T.S.E.
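As a small illustration of why polynomial inputs can help, consider a case where the log-likelihood ratio is exactly quadratic in the data. The sketch below assumes two zero-mean Gaussian hypotheses with different standard deviations (the values 1.0 and 2.0 are hypothetical and are not the QGP parameters of this report) and checks that a single linear combination of the inputs 1, x, x^2 reproduces the log-LR.

```python
import math

def gauss_pdf(x, sigma):
    """Zero-mean Gaussian density with standard deviation sigma."""
    return math.exp(-0.5 * (x / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))

def log_lr(x, sigma0, sigma1):
    """Log-likelihood ratio log p1(x)/p0(x) for the two assumed hypotheses."""
    return math.log(gauss_pdf(x, sigma1) / gauss_pdf(x, sigma0))

def poly_neuron(x, coeffs):
    """Linear combination of the polynomial inputs 1, x, x^2, ... (no nonlinearity)."""
    return sum(c * x ** k for k, c in enumerate(coeffs))

# Hypothetical standard deviations under H0 and H1.
sigma0, sigma1 = 1.0, 2.0

# Closed-form coefficients of the quadratic log-LR for this assumed pair.
coeffs = [
    math.log(sigma0 / sigma1),
    0.0,
    0.5 * (1.0 / sigma0 ** 2 - 1.0 / sigma1 ** 2),
]

test_points = [-3.0, -1.0, 0.0, 0.5, 2.0]
max_gap = max(abs(log_lr(x, sigma0, sigma1) - poly_neuron(x, coeffs)) for x in test_points)
print(max_gap)
```

In a trained polynomial network the three coefficients would be learned from data; the closed form is used here only to show that a degree-2 expansion is sufficient in this assumed case.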




A. Training Rules

1.  Backpropagation  based  on  mean-squared  error 

Let the training output of the network be $u_o^n$ at the n-th iteration, while the training hypothesis is $u^n$.
The backpropagation method trains the NN by minimizing the error energy

$$E = \sum_{n} \left( u_o^n - u^n \right)^2 \tag{V.7}$$


where the summation is over all training data during a training cycle. To implement a true gradient descent
using the nomenclature of the generalized delta rule [20], define for each neuron k the function

$$\delta_k = o_k (1 - o_k) \sum_{\text{all } j \text{ that } k \text{ leads to}} \delta_j w_{kj} \tag{V.8}$$


where $o_j$ is the output of neuron j and $w_{kj}$ is the current weight between node k and node j. The output
node is a special case where

$$\delta_o = 2 (u^n - u_o^n)\, u_o^n (1 - u_o^n) \tag{V.9}$$

The update of the weights during training is done using the difference equation

$$\Delta w_{kj}^{n} = \eta\, \delta_j o_k + \alpha\, \Delta w_{kj}^{n-1} \tag{V.10}$$

where $\eta$ and $\alpha$ are predefined constants that determine the rate of convergence. The second term in the weight
update equation is known as the momentum term.

The NN that was used for DDF consisted of three identical sensors and a fusion. Each sensor was
represented by an identical NN, each having one input neuron, one hidden layer with three neurons, and a
single-neuron output layer. The fusion NN consisted of three input-layer neurons, three hidden-layer neurons
and a single-neuron output layer. The NN was first trained on the LGP and QGP of the previous section.
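The mean-squared-error training described above can be sketched for one sensor-sized network (one input, three hidden neurons, one output). This is a minimal illustration, not the authors' implementation: the data are hypothetical Gaussian samples (mean 0 under H0, mean 2 under H1, unit variance), the constants `eta`, `alpha` and `epochs` are assumed values, and the weights are updated after every sample rather than once per training cycle.

```python
import math
import random

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def forward(x, p):
    """1-3-1 network: returns the hidden outputs and the network output."""
    h = [sigmoid(p["w1"][k] * x + p["b1"][k]) for k in range(3)]
    out = sigmoid(sum(p["w2"][k] * h[k] for k in range(3)) + p["b2"])
    return h, out

def error_energy(data, p):
    """Sum of squared output errors over the training set."""
    return sum((forward(x, p)[1] - u) ** 2 for x, u in data)

def train(data, p, eta=0.1, alpha=0.5, epochs=50):
    """Per-sample backpropagation with a momentum term (eta, alpha are assumed values)."""
    step = {"w1": [0.0] * 3, "b1": [0.0] * 3, "w2": [0.0] * 3, "b2": 0.0}
    for _ in range(epochs):
        for x, u in data:
            h, out = forward(x, p)
            # Output delta for the squared error with a sigmoid output.
            d_out = 2.0 * (u - out) * out * (1.0 - out)
            # Hidden deltas, computed before any weight is changed.
            d_hid = [h[k] * (1.0 - h[k]) * d_out * p["w2"][k] for k in range(3)]
            for k in range(3):
                step["w2"][k] = eta * d_out * h[k] + alpha * step["w2"][k]
                step["w1"][k] = eta * d_hid[k] * x + alpha * step["w1"][k]
                step["b1"][k] = eta * d_hid[k] + alpha * step["b1"][k]
                p["w2"][k] += step["w2"][k]
                p["w1"][k] += step["w1"][k]
                p["b1"][k] += step["b1"][k]
            step["b2"] = eta * d_out + alpha * step["b2"]
            p["b2"] += step["b2"]
    return p

random.seed(0)
# Hypothetical training set: label 0.0 for H0 samples, 1.0 for H1 samples.
data = [(random.gauss(0.0, 1.0), 0.0) for _ in range(100)]
data += [(random.gauss(2.0, 1.0), 1.0) for _ in range(100)]
random.shuffle(data)

params = {
    "w1": [random.uniform(-0.5, 0.5) for _ in range(3)],
    "b1": [random.uniform(-0.5, 0.5) for _ in range(3)],
    "w2": [random.uniform(-0.5, 0.5) for _ in range(3)],
    "b2": random.uniform(-0.5, 0.5),
}

e_before = error_energy(data, params)
train(data, params)
e_after = error_energy(data, params)
print(e_before, e_after)
```

The bias terms play the role of the weight attached to a constant input, in the same way that $w_0$ absorbs the threshold in the fusion rule.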

Backpropagation was used to train the three layer neural network to perform DDF. The test for convergence
was based on the criterion

[...]  (V.11)

Training was terminated when the criterion (V.11) was satisfied.


2. Training based on Neyman-Pearson

N-P training is conceptually identical to the backpropagation algorithm, except that training is done
around a desired false alarm rate at the fusion. In order to achieve training around a desired false alarm rate
$\alpha$ at the fusion, two possible performance criteria can be used to measure the output error:

$$E = P_M + \lambda (P_F - \alpha)^2 \tag{V.12}$$

or

[...]  (V.13)

where $P_M$, $P_F$ are the miss and false alarm probabilities at the fusion.

The modifications required to the standard backpropagation to implement the N-P fusion rule relate only
to the energy function derivative with respect to the output. To get this, first we express the probabilities in
terms of the output as

[...]

which give two possible derivative forms

[...]

for (V.12) and (V.13) respectively. If we set

[...]  (V.18)

where "o" designates the output neuron, then the backpropagation rule proceeds as described above. The
update rule (V.10) with $\delta_o$ defined by (V.18) implements a true gradient descent training by batch-processing
the training set, whereas the backpropagation with $\delta_o$ defined by (V.9) implements a "pseudo"-gradient
descent. A pseudo-gradient backpropagation with the N-P energy functions (V.12) or (V.13) did not manage
to produce a suitably trained NN. However, the true gradient N-P training rule (V.18) was successfully used
in training the NN to solve the DDF problems.
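To make the batch nature of the N-P criterion concrete, the sketch below evaluates an energy of the form "miss probability plus a penalty on the squared deviation of the false alarm probability from its target" and its derivative with respect to each network output. The expressions that relate the probabilities to the outputs are not legible in this copy, so the sketch relies on an assumption that is not taken from the report: the false alarm probability is estimated as the mean output over H0 training samples and the miss probability as the mean of one minus the output over H1 training samples. The names `np_energy`, `np_energy_grad`, `alpha` and `lam` are hypothetical.

```python
def np_energy(outputs, labels, alpha, lam):
    """Return (E, P_M, P_F) with E = P_M + lam * (P_F - alpha)**2.

    Assumed soft estimates: P_F = mean output under H0,
    P_M = mean of (1 - output) under H1.
    """
    h0 = [o for o, u in zip(outputs, labels) if u == 0]
    h1 = [o for o, u in zip(outputs, labels) if u == 1]
    p_f = sum(h0) / len(h0)
    p_m = sum(1.0 - o for o in h1) / len(h1)
    return p_m + lam * (p_f - alpha) ** 2, p_m, p_f

def np_energy_grad(outputs, labels, alpha, lam):
    """Derivative of E with respect to each output, under the same assumptions.

    Every entry depends on P_F, which needs the whole training set,
    so the gradient can only be formed by batch processing.
    """
    n0 = labels.count(0)
    n1 = labels.count(1)
    p_f = sum(o for o, u in zip(outputs, labels) if u == 0) / n0
    return [
        (-1.0 / n1) if u == 1 else 2.0 * lam * (p_f - alpha) / n0
        for u in labels
    ]

# Hypothetical network outputs and training hypotheses.
outputs = [0.2, 0.4, 0.7, 0.9]
labels = [0, 0, 1, 1]
energy, p_m, p_f = np_energy(outputs, labels, alpha=0.1, lam=10.0)
grad = np_energy_grad(outputs, labels, alpha=0.1, lam=10.0)
print(energy, p_m, p_f, grad)
```

Multiplying the negative of each derivative by the sigmoid slope at the output would give an output delta that can be fed to the usual backward pass.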
3. Training based on Kalman Filter

The problem of training a NN can be viewed as a Kalman Filtering problem [23]. If the ideal (unknown)
weights and thresholds of the NN are identified with the state x(n) of a Kalman Filter, then these weights
should be time-invariant, thus satisfying the plant equation

$$x(n+1) = x(n) \tag{V.19}$$

The unknown state x(n) in the NN is observed via the nonlinear output equation

$$d(n) = h(x(n)) + v(n) \tag{V.20}$$

where the error made from not knowing the weights and thresholds precisely is modeled as a zero mean, random
error v(n) with covariance matrix $E[v(n)v(n)^T] = R(n)$, a positive definite matrix. The nonlinear function h(.)
takes into account all the threshold nonlinearities at each neuron at every layer. From the nonlinear Kalman
Filter theory, the state x(n) can be estimated using the Extended Kalman Filter (EKF) with equations

$$x(n+1) = x(n) + K(n)\left[ d(n) - h(x(n)) \right] \tag{V.21}$$

$$K(n) = P(n) H(n) \left[ R(n) + H^T(n) P(n) H(n) \right]^{-1} \tag{V.22}$$

$$P(n+1) = P(n) - K(n) H^T(n) P(n) \tag{V.23}$$

where $H(n)_{ij}$ is the derivative of the output i with respect to weight j, computed as in the backpropagation.
Also, d(n) is the vector of desired outputs of the output neurons. For more details on the use of the EKF for
training the NN to perform the DDF see [22].
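A single EKF update of the kind described above can be sketched for the smallest possible case: one sigmoid neuron with one weight and one bias, so the state has two components and the output is a scalar. This is an illustration under assumptions, not the procedure of [22] or [23]: the function name `ekf_step`, the scalar noise variance `r`, the initial covariance and the input/target values are all hypothetical. The derivative vector is stored with one entry per weight, so that the gain has the same length as the state.

```python
import math

def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))

def ekf_step(x, P, s, d, r=1.0):
    """One EKF update for the neuron out = sigmoid(x[0] * s + x[1]).

    x: state [weight, bias]; P: 2x2 covariance (list of lists);
    s: scalar input; d: desired output; r: assumed observation noise variance.
    """
    out = sigmoid(x[0] * s + x[1])
    slope = out * (1.0 - out)
    H = [slope * s, slope]                      # d(out)/d(weight), d(out)/d(bias)
    PH = [P[i][0] * H[0] + P[i][1] * H[1] for i in range(2)]
    denom = r + H[0] * PH[0] + H[1] * PH[1]     # scalar r + H^T P H
    K = [PH[i] / denom for i in range(2)]       # gain vector
    x_new = [x[i] + K[i] * (d - out) for i in range(2)]
    HtP = [H[0] * P[0][j] + H[1] * P[1][j] for j in range(2)]
    P_new = [[P[i][j] - K[i] * HtP[j] for j in range(2)] for i in range(2)]
    return x_new, P_new

# Hypothetical starting point: zero weights, identity covariance.
x0 = [0.0, 0.0]
P0 = [[1.0, 0.0], [0.0, 1.0]]
x1, P1 = ekf_step(x0, P0, s=1.0, d=1.0)
print(x1, P1)
```

For a network with several outputs the scalar division becomes a matrix inverse, and the derivative entries come from the same backward pass used in backpropagation.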


B. Simulation Results

The input data for each NN sensor were generated from the LGP and QGP distributions that were used
to benchmark the SANs. The results are shown in figures 15 through 18. For the LGP, one hundred training
points were sufficient to obtain a ROC close to the optimal DDF. However, for the QGP, one thousand sample
points were required to obtain an acceptable ROC. If the solutions of the error based backpropagation are
compared with the N-P based backpropagation, it is seen that the latter results in superior performance. Yet
if the results from the perceptron-type NN are compared with the N-P trained SAN, figures 13 and 14, the
latter results in superior performance with considerably fewer data samples, in particular for the QGP (200
points for SAN vs 1000 points for PTNN). However, it should be stressed that no separate pretraining of each
sensor NN was required with BPTNN, as was required for SANs in order to perform DDF.
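Comparisons of this kind rest on empirical ROC curves. A minimal way to obtain ROC points from the fusion outputs of a trained network is to sweep a decision threshold over the outputs recorded under each hypothesis. The sketch below uses hypothetical output values and a hypothetical function name; it does not reproduce the data behind the figures.

```python
def roc_points(scores_h0, scores_h1, thresholds):
    """Return a list of (P_F, P_D) pairs, one per threshold.

    scores_h0 / scores_h1: fusion outputs recorded under H0 / H1.
    A sample is declared H1 when its score exceeds the threshold.
    """
    points = []
    for t in thresholds:
        p_f = sum(1 for s in scores_h0 if s > t) / len(scores_h0)
        p_d = sum(1 for s in scores_h1 if s > t) / len(scores_h1)
        points.append((p_f, p_d))
    return points

# Hypothetical fusion outputs, for illustration only.
scores_h0 = [0.1, 0.4]
scores_h1 = [0.35, 0.8]
points = roc_points(scores_h0, scores_h1, thresholds=[0.0, 0.5, 1.0])
print(points)
```

Lowering the threshold moves the operating point toward the upper right of the ROC; the number of test samples limits how finely the curve can be resolved.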

Overall, SANs have the advantage that their performance can be understood and interpreted analytically,
since they are by construction parametric approximations to the LR optimal fusion rules. For the PTNNs,
such an interpretation is not forthcoming, limiting the extrapolation of conclusions based on limited training
data sets to general classes of problems.

VI.  SUMMARY 

Natural structural similarities between the Bayesian DDF solution and adaptive networks are exploited.
It is shown that structured adaptive networks (SANs) and perceptron-type neural networks (PTNNs)
can learn to solve centralized and distributed hypothesis testing problems efficiently, even in the absence of
explicit statistical information about the data, provided that the proper training rule and procedure are
followed. Two training rules are investigated: a mean squared error (MSE) based rule, and a rule based on the
Neyman-Pearson (N-P) test. Under both training rules, the post-training performance of the network is very
comparable to the optimal likelihood ratio test (LRT). However, the N-P rule trained networks outperform the
MSE rule trained networks, even when selective training is used with the latter. The behavior of the networks
under the two training rules is studied extensively in hypothesis testing problems with linearly and nonlinearly
separable data. Similarities and differences in the behavior and performance of the networks are discussed.